NVIDIA Senior AI Infrastructure Software Engineer

Experienced hire · Full-time · 2026-01-19 · Location: Shanghai · Status: Hiring


Qualifications

• Master or PhD or equivalent experience in Computer Science or related field, with a minimum of 5 years in large-scale distributed systems or AI infrastructure. 
• Advanced expertise in Python (required), strong experience with JavaScript, and deep knowledge of software engineering principles, OOP/functional programming, and writing high-performance, maintainable code. 
• Demonstrated expertise in crafting scalable microservices, web apps, SQL, and NoSQL databases (especially MongoDB and Redis) in production with containers, Kubernetes, and CI/CD.  
• Solid experience with distributed messaging systems (e.g., Kafka), and integrating event-driven or decoupled architectures into robust enterprise solutions. 
• Practical exp…


Responsibilities

• Design, develop, and improve scalable infrastructure to support the next generation of AI applications, including copilots and agentic tools. 
• Drive improvements in architecture, performance, and reliability, enabling teams to apply LLMs and advanced agent frameworks at scale.
• Collaborate across hardware, software, and research teams, mentoring and supporting peers while encouraging best engineering practices and a culture of technical excellence. 
• Stay informed of the latest advancements in AI infrastructure and contribute to continuous innovation across the organization.


Related positions

Senior DGX Cloud AI Infrastructure Software Engineer

Experienced hire

Joining NVIDIA's DGX Cloud Team means contributing to the infrastructure that powers our innovative AI research. This team focuses on optimizing the efficiency and resiliency of AI workloads, as well as developing scalable AI and data infrastructure tools and services. Our objective is to deliver a stable, scalable environment for AI researchers, providing them with the necessary resources and scale to foster innovation.

We are seeking an AI infrastructure software engineer to join our team. You'll be instrumental in designing, building, and maintaining AI infrastructure that enables large-scale AI training and inferencing. The responsibilities include implementing software and systems engineering practices to ensure high efficiency and availability of AI systems.

As a senior DGX Cloud AI Infrastructure software engineer at NVIDIA, you will have the opportunity to work on innovative technologies that power the future of AI and data science, and be part of a dynamic and supportive team that values learning and growth. The role provides the autonomy to work on meaningful projects with the support and mentorship needed to succeed, and contributes to a culture of blameless postmortems, iterative improvement, and risk-taking. If you are seeking an exciting and rewarding career that makes a difference, we invite you to apply now!

What you'll be doing:
• Develop infrastructure software and tools for large-scale AI, LLM, and GenAI infrastructure.
• Develop and optimize tools to improve infrastructure efficiency and resiliency.
• Root-cause, analyze, and triage failures from the application level to the hardware level.
• Enhance infrastructure and products underpinning NVIDIA's AI platforms.
• Co-design and implement APIs for integration with NVIDIA's resiliency stacks.
• Define meaningful and actionable reliability metrics to track and improve system and service reliability.
• Skilled in problem-solving, root cause analysis, and optimization.

Updated 2025-10-07 · Shanghai

Senior DGX Cloud AI Infrastructure Software Engineer

Experienced hire

• Work with the DGX Cloud Lepton Marketplace team to establish integrations with NVIDIA Cloud Partners, enabling global developers to easily access GPU-optimized virtual machines.
• Craft and implement IaaS API integrations, collaborating with external engineering teams to ensure reliable, scalable, and consistent connectivity across diverse cloud environments.
• Shape integration strategies, develop stateful workflow orchestration, and drive improvements in testing, observability, and automation to ensure high-quality, fault-tolerant solutions.
• Be responsible for developing the two-sided marketplace, including the integration of compute providers and crafting discovery and bidding experiences to match supply with demand.

Updated 2025-11-11 · Shanghai

Senior DevOps Engineer - AI and AV Infrastructure

Experienced hire

N/A

Updated 2025-09-24 · Shanghai | Beijing | Shenzhen

Senior AI Training Performance Engineer

Experienced hire

N/A

Updated 2026-06-24 · Shanghai