Amazon Data Engineer, AIT
Job Requirements
Basic qualifications
- 1+ years of data engineering experience
- Experience with data modeling, warehousing, and building ETL pipelines
- Experience with one or more query languages (e.g., SQL, PL/SQL, DDL, MDX, HiveQL, SparkSQL, Scala)
- Experience with one or more scripting languages (e.g., Python, KornShell)
Preferred qualifications
- Experience with big data technologies such as Hadoop, Hive, Spark, EMR
- Experience with any ETL tool, such as Informatica, ODI, SSIS, BODI, Datastage, etc.
- Experience with AWS technologies like Redshift, S3, AWS G…
Job Responsibilities
• Design and implement end-to-end data pipelines (ETL) to ensure efficient data collection, cleansing, transformation, and storage, supporting both real-time and offline analytics needs (a minimal sketch follows this list).
• Develop automated data monitoring tools and interactive dashboards to enhance business teams' insights into core metrics (e.g., user behavior, AI model performance).
• Collaborate with cross-functional teams (e.g., Product, Operations, Tech) to align data logic, integrate multi-source data (e.g., user behavior, transaction logs, AI outputs), and build a unified data layer.
• Establish data standardization and governance policies to ensure consistency, accuracy, and compliance.
• Provide structured data inputs for AI model training and inference (e.g., LLM applications, recommendation systems), optimizing feature engineering workflows.
• Explore innovative AI-data integration use cases (e.g., embedding AI-generated insights into BI tools).
• Provide technical guidance and best practices on data architecture that meets both traditional reporting purposes and modern AI Agent requirements.
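To make the first responsibility concrete, here is a minimal, hedged PySpark sketch of one batch collect-cleanse-transform-store step. The S3 paths, column names (event_id, event_ts, user_id), and the daily-activity aggregation are illustrative assumptions, not part of the role description.

```python
# A minimal ETL sketch, assuming PySpark and hypothetical S3 paths/columns;
# a real pipeline would add schema enforcement, incremental loads, and monitoring.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("user-behavior-etl").getOrCreate()

# Collect: read raw user-behavior events (path is illustrative).
raw = spark.read.json("s3://example-bucket/raw/user_behavior/")

# Cleanse: drop malformed rows and deduplicate on the event id.
clean = (raw
         .filter(F.col("event_id").isNotNull())
         .dropDuplicates(["event_id"]))

# Transform: aggregate daily activity per user as one core metric.
daily = (clean
         .withColumn("event_date", F.to_date("event_ts"))
         .groupBy("user_id", "event_date")
         .agg(F.count("*").alias("event_count")))

# Store: write a partitioned table for downstream dashboards and models.
(daily.write
      .mode("overwrite")
      .partitionBy("event_date")
      .parquet("s3://example-bucket/curated/user_daily_activity/"))
```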
• Collaborate with BIE, DE, PM, and CSM partners to research, design, develop, and evaluate generative AI solutions that address Global Selling challenges.
• Interact with stakeholders directly to understand their business problems, aid them in implementing generative AI solutions, and brief and guide stakeholders on adoption patterns and paths to production.
• Create and deliver best-practice recommendations, tutorials, blog posts, sample code, and presentations adapted to technical, business, and executive stakeholders (see the sketch after this list).
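Since the bullets mention delivering sample code for generative AI solutions, the sketch below shows one hedged example of calling a foundation model through the Amazon Bedrock Converse API with boto3. The region, model ID, and prompt are assumptions for illustration, not a prescribed stack.

```python
# A minimal sketch of a Bedrock Converse call, assuming boto3 with Bedrock
# access; region, model ID, and prompt are placeholders.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # illustrative model ID
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize this seller's weekly sales trends: ..."}],
    }],
)

# The assistant reply comes back as a list of content blocks.
print(response["output"]["message"]["content"][0]["text"])
```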
Design and build cloud-based data warehouses to deliver efficient analytical and reporting capabilities for Apple’s global and regional sales and finance teams. Develop highly scalable data pipelines to ingest and process data from multiple source systems, leveraging Apache Airflow for workflow orchestration, scheduling, and monitoring. Architect generic, reusable solutions that adhere to data warehousing best practices while addressing complex business requirements. Analyze and optimize existing systems, providing improvements and ongoing support as needed. Uphold the highest standards of data integrity and software quality, ensuring reliable and accurate outputs. We are looking for a proactive self-starter who takes initiative, learns fast, and works well across teams. Join our growing team where no two days are the same - solving tough technical challenges and business problems in a fast-paced environment.
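The paragraph above calls out Apache Airflow for orchestration, scheduling, and monitoring. Below is a minimal TaskFlow-style DAG sketch, assuming Airflow 2.4+; the DAG name, schedule, and stubbed extract/load tasks are illustrative, not the team's actual pipeline.

```python
# A minimal Airflow DAG sketch, assuming Airflow 2.4+ with the TaskFlow API;
# table and source names are illustrative.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def sales_warehouse_load():

    @task()
    def extract():
        # Pull a day's worth of sales records from a source system (stubbed).
        return [{"order_id": 1, "amount": 120.0}]

    @task()
    def load(rows: list):
        # Load the batch into the warehouse; a real task would use a
        # database hook or COPY command instead of printing.
        print(f"loading {len(rows)} rows into sales_fact")

    load(extract())

sales_warehouse_load()
```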
- Work with global teams to enable reliable data for GCR business operations while following strict security compliance requirements, and build a data foundation that GCR users can self-serve for their use cases
- Build and enhance data platforms that let end users easily and securely access data and insights by leveraging AI, AWS services, and open-source services
- Collaborate with product managers and SDE team members to design and implement data products that meet business requirements and deliver measurable value
- Implement robust data quality monitoring, validation frameworks, and governance practices while optimizing compute solutions for performance and cost efficiency (see the sketch after this list)
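For the data quality monitoring and validation bullet, the sketch below shows one simple PySpark quality gate run before a dataset is published. The dataset path, key column, and 1% duplicate threshold are assumptions for illustration only.

```python
# A minimal data quality gate sketch, assuming PySpark; path, key column,
# and thresholds are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
orders = spark.read.parquet("s3://example-bucket/curated/orders/")

total = orders.count()
null_ids = orders.filter(F.col("order_id").isNull()).count()
dupes = total - orders.dropDuplicates(["order_id"]).count()

# Fail fast so downstream consumers never see a bad partition.
if total == 0 or null_ids > 0 or dupes / max(total, 1) > 0.01:
    raise ValueError(
        f"DQ check failed: rows={total}, null_ids={null_ids}, dupes={dupes}")
```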