
Amazon Principal AI Specialist Solution Architect - Infrastructure

Experienced hire, full-time, Solution Architect. Location: Shanghai | Beijing | Shenzhen. Status: Open

Qualifications


Basic qualifications
- 5+ years of hands-on experience optimizing AI infrastructure, with deep expertise in inference acceleration frameworks (e.g., vLLM, SGLang, TensorRT) and in model training and serving systems across the PyTorch and TensorFlow ecosystems (a minimal serving sketch follows this list);
- Advanced proficiency in Nvidia GPU performance optimization techniques, including memory management, kernel fusion, and quantization strategies for large-scale deep learning workloads;
- Strong foundation in parallel computing principles with practical CUDA programming experience, emphasizing efficient resource utilization and throughput maximization;
- Demonstrated success implementing and tuning distributed AI systems leveraging modern frameworks like Megatron-LM and Ray, with particular focus on LLM deployment and horizontal scaling across GPU clusters.
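
As a hedged illustration of the inference-acceleration experience described above, the sketch below runs offline batch inference with vLLM. It is a minimal example, not a prescribed setup: the checkpoint name, tensor-parallel degree, and memory budget are placeholder assumptions.

```python
# Minimal vLLM offline-inference sketch. Checkpoint name, tensor-parallel
# degree, and memory budget are illustrative assumptions, not requirements.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder checkpoint
    tensor_parallel_size=2,                    # shard weights across 2 GPUs
    gpu_memory_utilization=0.90,               # budget for weights + KV cache
    # quantization="awq" could be passed here for an AWQ-quantized checkpoint
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize kernel fusion in one sentence."], params)
print(outputs[0].outputs[0].text)
```

Frameworks such as SGLang or TensorRT-LLM expose comparable serving entry points; the choice typically follows the target hardware and latency profile.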

Preferred qualifications
- First-hand implementation experience operating and optimizing AI infrastructure (Nvidia GPU chips, frameworks, servers, networking, power, etc.);
- Graduate degree in a highly quantitative field (Computer Science, Machine Learning, Operations Research, Statistics, Mathematics, etc.);
- Proficiency in performance optimization on Amazon Trainium;
- Proficiency in kernel programming for accelerated hardware using programming models such as (but not limited to) CUDA (a kernel-fusion sketch follows this list);
- Solid end-to-end, hands-on development experience with Transformer-related deep learning algorithms;
- Experience with patents or publications at top-tier peer-reviewed conferences or journals;
- Experience writing and speaking about complex technical concepts for broad audiences in a simplified format; strong written and verbal communication skills.
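
To make the kernel-programming and kernel-fusion requirements above concrete, here is a minimal fused elementwise kernel written in Triton, a Python-embedded GPU kernel language; the add-plus-ReLU fusion and the block size are illustrative choices, not part of the role description.

```python
# Minimal Triton sketch of kernel fusion: add + ReLU in one pass over memory
# instead of two separate kernel launches. Block size is an illustrative choice.
import torch
import triton
import triton.language as tl

@triton.jit
def fused_add_relu_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offsets < n_elements                    # guard the tail block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, tl.maximum(x + y, 0.0), mask=mask)

def fused_add_relu(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)                 # one program per 1024 elements
    fused_add_relu_kernel[grid](x, y, out, n, BLOCK=1024)
    return out

if __name__ == "__main__":
    a = torch.randn(1 << 20, device="cuda")
    b = torch.randn(1 << 20, device="cuda")
    assert torch.allclose(fused_add_relu(a, b), torch.relu(a + b))
```

The same fusion could be expressed directly in CUDA C++; Triton is used here only to keep all sketches in Python.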

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Responsibilities


- As an AIML Specialist Solutions Architect (SA) in AI Infrastructure, you will serve as the Subject Matter Expert (SME) for providing optimal solutions for model training and inference workloads that leverage Amazon Web Services accelerated computing services. As part of the Specialist Solutions Architecture team, you will work closely with other Specialist SAs to enable large-scale customer model workloads and drive the adoption of AWS EC2, EKS, ECS, SageMaker and other computing platforms for GenAI practice.
- You will interact with other SAs in the field, providing guidance on their customer engagements, and you will develop white papers, blogs, reference implementations, and presentations to enable customers and partners to fully leverage AI Infrastructure on Amazon Web Services. You will also create field enablement materials for the broader SA population, to help them understand how to integrate Amazon Web Services GenAI solutions into customer architectures.
- You must have deep technical experience working with technologies related to Large Language Models (LLMs), Stable Diffusion and many other SOTA model architectures, spanning model design, fine-tuning, distributed training and inference acceleration. A strong machine learning development background is preferred, in addition to experience with application building and architecture design. You will be familiar with the Nvidia ecosystem and related technical options, and will leverage this knowledge to help Amazon Web Services customers in their selection process.
- Candidates must have great communication skills and be very technical and hands-on, with the ability to impress Amazon Web Services customers at any level, from ML engineers to executives. Previous experience with Amazon Web Services is desired but not required, provided you have experience building large-scale solutions. You will get the opportunity to work directly with senior engineers at customers, partners and Amazon Web Services service teams, influencing their roadmaps and driving innovations.
Related topics (English materials included): vLLM, SGLang, TensorRT, PyTorch, TensorFlow, kernels, CUDA, Megatron, Ray, LLMs
Related positions

NVIDIA | Experienced hire

- Develop and implement strategies to optimize AI model inference for on-device deployment.
- Employ techniques like pruning, quantization, and knowledge distillation to minimize model size and computational demands (see the sketch after this list).
- Optimize performance-critical components using CUDA and C++.
- Collaborate with multi-functional teams to align optimization efforts with hardware capabilities and deployment needs.
- Benchmark inference performance, identify bottlenecks, and implement solutions.
- Research and apply innovative methods for inference optimization.
- Adapt models for diverse hardware platforms and operating systems with varying capabilities.
- Create tools to validate the accuracy and latency of deployed models at scale with minimal friction.
- Recommend and implement model architecture changes to improve the accuracy-latency balance.
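
Two of the size-reduction techniques named in this listing, quantization and pruning, can be sketched in a few lines of PyTorch. The toy model, the int8 dynamic quantization, and the 30% sparsity level below are assumptions for illustration only, not the team's actual workflow.

```python
# Toy PyTorch sketch of post-training dynamic quantization and magnitude
# pruning. The model and all settings are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Dynamic quantization: Linear weights stored as int8, activations quantized
# on the fly at inference time (returns a separate, converted copy).
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# Unstructured magnitude pruning: zero out the 30% smallest-magnitude weights
# of the first Linear layer, then bake the mask into the weight tensor.
prune.l1_unstructured(model[0], name="weight", amount=0.3)
prune.remove(model[0], "weight")

x = torch.randn(1, 512)
print(quantized(x).shape, model(x).shape)
```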

Updated 2025-10-15
Microsoft | Experienced hire | Research

- Initiate and drive research that advances the state of the art in AI for Software Engineering.
- Collaborate across disciplines with product teams across Microsoft and GitHub.
- Stay up to date with the research literature and product advances in AI for software engineering.
- Collaborate with world-renowned experts in programming tools and developer tools to integrate AI across the software development stack for Copilot.
- Build and manage large-scale AI experiments and models.

Updated 2025-07-21
Microsoft | Experienced hire | Software

Architect, build, and optimize secure and performant AI platform services that power Microsoft Copilot and other next-generation AI scenarios. Provide technical leadership across teams to define long-term architectural direction and drive engineering excellence. Collaborate with infrastructure, platform, product, and research teams to design and deliver scalable, production-grade AI services. Write high-quality, well-tested, secure, and maintainable code and promote high standards across the team. Tackle technically ambiguous or cross-boundary problems, remove roadblocks, and drive delivery across multiple teams or organizations. Lead technical design discussions, mentor senior engineers, and foster a strong engineering culture within the team. Embody Microsoft’s Culture and Values, and help shape the direction of the engineering team and broader organization.

Updated 2025-09-19