
Amazon Principal AI Specialist Solution Architect - Infrastructure

Experienced hire, full-time, Solution Architect. Location: Shanghai | Beijing | Shenzhen. Status: Open

Qualifications


Basic qualifications
- 5+ years of hands-on experience optimizing AI infrastructure, with deep expertise in inference acceleration frameworks (e.g., vLLM, SGLang, TensorRT) and in model training and serving systems across the PyTorch and TensorFlow ecosystems (a minimal serving sketch follows this list);
- Advanced proficiency in Nvidia GPU performance optimization techniques, including memory management, kernel fusion, and quantization strategies for large-scale deep learning workloads;
- Strong foundation in parallel computing principles with practical CUDA programming experience, emphasizing efficient resource utilization and throughput maximization;
- Demonstrated success implementing and tuning distributed AI systems leveraging modern frameworks like Megatron-LM and Ray, with particular focus on LLM deployment and horizontal scaling across GPU clusters.
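
As a hedged illustration of the inference-acceleration experience described above, the sketch below runs offline batch inference with vLLM. It is a minimal example, not a prescribed setup: the checkpoint name, tensor-parallel degree, and memory budget are placeholder assumptions.

```python
# Minimal vLLM offline-inference sketch. Checkpoint name, tensor-parallel
# degree, and memory budget are illustrative assumptions, not requirements.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder checkpoint
    tensor_parallel_size=2,                    # shard weights across 2 GPUs
    gpu_memory_utilization=0.90,               # budget for weights + KV cache
    # quantization="awq" could be passed here for an AWQ-quantized checkpoint
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize kernel fusion in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Frameworks such as SGLang or TensorRT-LLM expose comparable serving entry points; the choice typically follows the target hardware and latency profile.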

Preferred qualifications
- First-hand implementation experience operating and optimizing AI infrastructure (Nvidia GPU chips, frameworks, servers, networking, power, etc.);
- Graduate degree in a highly quantitative field (Computer Science, Machine Learning, Operations Research, Statistics, Mathematics, etc.);
- Proficiency in performance optimization on Amazon Trainium;
- Proficiency in kernel programming for accelerated hardware using programming models such as (but not limited to) CUDA (a kernel-fusion sketch follows this list);
- Solid end-to-end, hands-on development experience with Transformer-related deep learning algorithms;
- Experience with patents or publications at top-tier peer-reviewed conferences or journals;
- Experience writing and speaking about complex technical concepts for broad audiences in a simplified format; strong written and verbal communication skills.
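
To make the kernel-programming and kernel-fusion requirements above concrete, here is a minimal fused elementwise kernel written in Triton, a Python-embedded GPU kernel language; the add-plus-ReLU fusion and the block size are illustrative choices, not part of the role description.

```python
# Minimal Triton sketch of kernel fusion: add + ReLU in one pass over memory
# instead of two separate kernel launches. Block size is an illustrative choice.
import torch
import triton
import triton.language as tl

@triton.jit
def fused_add_relu_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements                    # guard the tail block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, tl.maximum(x + y, 0.0), mask=mask)

def fused_add_relu(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)                 # one program per 1024 elements
    fused_add_relu_kernel[grid](x, y, out, n, BLOCK=1024)
    return out

if __name__ == "__main__":
    a = torch.randn(1 << 20, device="cuda")
    b = torch.randn(1 << 20, device="cuda")
    assert torch.allclose(fused_add_relu(a, b), torch.relu(a + b))
```

The same fusion could be expressed directly in CUDA C++; Triton is used here only to keep all sketches in Python.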

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Responsibilities


- As an AIML Specialist Solutions Architect (SA) in AI Infrastructure, you will serve as the Subject Matter Expert (SME) for providing optimal solutions for model training and inference workloads that leverage Amazon Web Services accelerated computing services. As part of the Specialist Solutions Architecture team, you will work closely with other Specialist SAs to enable large-scale customer model workloads and drive the adoption of AWS EC2, EKS, ECS, SageMaker and other computing platforms for GenAI practice.
- You will interact with other SAs in the field, providing guidance on their customer engagements, and you will develop white papers, blogs, reference implementations, and presentations to enable customers and partners to fully leverage AI Infrastructure on Amazon Web Services. You will also create field enablement materials for the broader SA population, to help them understand how to integrate Amazon Web Services GenAI solutions into customer architectures.
- You must have deep technical experience working with technologies related to Large Language Models (LLMs), Stable Diffusion and many other SOTA model architectures, spanning model design, fine-tuning, distributed training and inference acceleration. A strong machine learning development background is preferred, in addition to experience with application building and architecture design. You will be familiar with the Nvidia ecosystem and related technical options, and will leverage this knowledge to help Amazon Web Services customers in their selection process.
- Candidates must have great communication skills and be very technical and hands-on, with the ability to impress Amazon Web Services customers at any level, from ML engineers to executives. Previous experience with Amazon Web Services is desired but not required, provided you have experience building large-scale solutions. You will get the opportunity to work directly with senior engineers at customers, partners and Amazon Web Services service teams, influencing their roadmaps and driving innovations.
Related topics (English materials included): vLLM, SGLang, TensorRT, PyTorch, TensorFlow, kernels, CUDA, Megatron, Ray, LLMs
Related positions

NVIDIA | Experienced hire

- Develop and implement strategies to optimize AI model inference for on-device deployment.
- Employ techniques like pruning, quantization, and knowledge distillation to minimize model size and computational demands (see the sketch after this list).
- Optimize performance-critical components using CUDA and C++.
- Collaborate with multi-functional teams to align optimization efforts with hardware capabilities and deployment needs.
- Benchmark inference performance, identify bottlenecks, and implement solutions.
- Research and apply innovative methods for inference optimization.
- Adapt models for diverse hardware platforms and operating systems with varying capabilities.
- Create tools to validate the accuracy and latency of deployed models at scale with minimal friction.
- Recommend and implement model architecture changes to improve the accuracy-latency balance.
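
Two of the size-reduction techniques named in this listing, quantization and pruning, can be sketched in a few lines of PyTorch. The toy model, the int8 dynamic quantization, and the 30% sparsity level below are assumptions for illustration only, not the team's actual workflow.

```python
# Toy PyTorch sketch of post-training dynamic quantization and magnitude
# pruning. The model and all settings are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Dynamic quantization: Linear weights stored as int8, activations quantized
# on the fly at inference time (returns a separate, converted copy).
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Unstructured magnitude pruning: zero out the 30% smallest-magnitude weights
# of the first Linear layer, then bake the mask into the weight tensor.
prune.l1_unstructured(model[0], name="weight", amount=0.3)
prune.remove(model[0], "weight")

x = torch.randn(1, 512)
print(quantized(x).shape, model(x).shape)
```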

Updated 2025-10-15
Microsoft | Experienced hire | Research

- Initiate and drive research that advances the state of the art in AI for Software Engineering.
- Collaborate across disciplines with product teams across Microsoft and GitHub.
- Stay up to date with the research literature and product advances in AI for software engineering.
- Collaborate with world-renowned experts in programming tools and developer tools to integrate AI across the software development stack for Copilot.
- Build and manage large-scale AI experiments and models.

Updated 2025-07-21
Microsoft | Experienced hire | Software

Architect, build, and optimize secure and performant AI platform services that power Microsoft Copilot and other next-generation AI scenarios. Provide technical leadership across teams to define long-term architectural direction and drive engineering excellence. Collaborate with infrastructure, platform, product, and research teams to design and deliver scalable, production-grade AI services. Write high-quality, well-tested, secure, and maintainable code and promote high standards across the team. Tackle technically ambiguous or cross-boundary problems, remove roadblocks, and drive delivery across multiple teams or organizations. Lead technical design discussions, mentor senior engineers, and foster a strong engineering culture within the team. Embody Microsoft’s Culture and Values, and help shape the direction of the engineering team and broader organization.

Updated 2025-09-19