
Amazon: Data Engineer, AWS GCR Tech

Experienced hire · Full-time · Data Engineering · Location: Beijing · Status: Hiring

Qualifications


Basic qualifications
- 5+ years of data engineering experience
- Bachelor's degree
- Experience with data modeling, warehousing and building ETL pipelines
- Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
- Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets

Preferred qualifications
- Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
- Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
- Hands-on experience with GenAI technologies (LLMs, foundation models, prompt engineering)

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Responsibilities


- Work with global teams to provide reliable data for GCR business operations while following strict security compliance requirements, and build a data foundation that lets GCR users self-serve their own use cases
- Build and enhance data platforms that let end users easily and securely access data and insights by leveraging AI, AWS services, and open source services
- Collaborate with product managers and SDE team members to design and implement data products that meet business requirements and deliver measurable value
- Implement robust data quality monitoring, validation frameworks, and governance practices while optimizing compute solutions for performance and cost efficiency
Related positions

Amazon · Experienced hire · Professi

1. Generative AI Model Development:
- Design and develop generative AI models, including language models, image generation models, and multimodal models.
- Explore and implement advanced techniques in areas such as transformer architectures, attention mechanisms, and self-supervised learning.
- Conduct research and stay up-to-date with the latest advancements in the field of generative AI.

2. Data Acquisition and Preprocessing:
- Identify and acquire relevant data sources for training generative AI models.
- Develop robust data preprocessing pipelines, ensuring data quality, cleanliness, and compliance with ethical and regulatory standards.
- Implement techniques for data augmentation, denoising, and domain adaptation to enhance model performance.

3. Model Training and Optimization:
- Design and implement efficient training pipelines for large-scale generative AI models.
- Leverage distributed computing resources, such as GPUs and cloud platforms, for efficient model training.
- Optimize model architectures, hyperparameters, and training strategies to achieve superior performance and generalization.

4. Model Evaluation and Deployment:
- Develop comprehensive evaluation metrics and frameworks to assess the performance, safety, and bias of generative AI models.
- Collaborate with cross-functional teams to ensure the successful deployment and integration of generative AI models into client solutions.

5. Collaboration and Knowledge Sharing:
- Collaborate with data engineers, software engineers, and subject matter experts to develop innovative solutions leveraging generative AI.
- Contribute to the firm's thought leadership by presenting at conferences, and participating in industry events.

Updated 2025-04-27
Amazon · Experienced hire · Data Eng

- Design and implement end-to-end data pipelines (ETL) to ensure efficient data collection, cleansing, transformation, and storage, supporting both real-time and offline analytics needs.
- Develop automated data monitoring tools and interactive dashboards to enhance business teams’ insights into core metrics (e.g., user behavior, AI model performance).
- Collaborate with cross-functional teams (e.g., Product, Operations, Tech) to align data logic, integrate multi-source data (e.g., user behavior, transaction logs, AI outputs), and build a unified data layer.
- Establish data standardization and governance policies to ensure consistency, accuracy, and compliance.
- Provide structured data inputs for AI model training and inference (e.g., LLM applications, recommendation systems), optimizing feature engineering workflows.
- Explore innovative AI-data integration use cases (e.g., embedding AI-generated insights into BI tools).
- Provide technical guidance and best practices on data architecture and BI solutions.

Updated 2025-06-12
Amazon · Experienced hire · Data Eng

1. Design, develop, and maintain scalable data pipelines to support ML model development and production deployment.
2. Implement and maintain CI/CD pipelines for the data and ML solutions.
3. Collaborate with data scientists and other team members to understand data requirements and implement efficient data processing solutions.
4. Create and manage data warehouses and data lakes, ensuring proper data governance and security measures are in place.
5. Collaborate with product managers and business stakeholders to understand data needs and translate them into technical requirements.
6. Stay current with emerging technologies and best practices in data engineering, and propose innovative solutions to improve data infrastructure and processes for ML models and analytics applications.
7. Participate in code reviews and contribute to the development of best practices for data engineering within the team.

Updated 2025-07-21
Apple · Experienced hire · Machine

Design and build cloud-based data warehouses to deliver efficient analytical and reporting capabilities for Apple’s global and regional sales and finance teams. Develop highly scalable data pipelines to ingest and process data from multiple source systems, leveraging Apache Airflow for workflow orchestration, scheduling, and monitoring. Architect generic, reusable solutions that adhere to data warehousing best practices while addressing complex business requirements. Analyze and optimize existing systems, providing improvements and ongoing support as needed. Uphold the highest standards of data integrity and software quality, ensuring reliable and accurate outputs. We are looking for a proactive self-starter who takes initiative, learns fast, and works well across teams. Join our growing team, where no two days are the same, solving tough technical challenges and business problems in a fast-paced environment.

Updated 2025-07-15