
NVIDIA Systems Infrastructure Software Engineer

Experienced hire | Full-time | Location: Shanghai | Status: Hiring

Requirements


The NVIDIA Infrastructure Group is seeking world-class programmers to design, implement, and debug the next generation of large-scale, general-purpose graphics and computing chips. In this role, you will help build the core verification infrastructure that drives the development of our GPU and Tegra chips. This strongly object-oriented C++ and Python infrastructure encompasses several extensive applications that allow us to efficiently verify the world's largest chips with a sophisticated distributed computing execution and triage environment. Come and join our diverse, international, fast-paced team with high production-quality standards.
What You’ll Be Doing:
• Developing environments to program and test next-generation GPU and SoC features well before they are integrated into products or supported by driver software. Every day brings new and meaningful challenges.
• Collaborating with colleagues across architecture, hardware, and software teams to unlock the functionality and performance of next-generation NVIDIA chips.
• Participating in the full chip development lifecycle—from architectural specification through release to production.

What We Need to See:
• BS or MS (or equivalent experience) in CS, CE, EE, or a related field.
• Strong proficiency in C++ programming.
• 3+ years of work experience.
• Familiarity with compute architecture, OS design, or driver development (a strong plus).
• Python programming experience (a plus).
• Excellent communication skills.
NVIDIA offers highly competitive salaries and a comprehensive benefits package. You’ll work alongside some of the most brilliant and talented engineers in the world. With our unprecedented growth, our extraordinary engineering teams are expanding rapidly. If you are a creative, autonomous engineer with a genuine passion for technology, we want to hear from you.

Responsibilities


N/A
Related positions

Experienced hire

• Design, develop, and improve scalable infrastructure to support the next generation of AI applications, including copilots and agentic tools.
• Drive improvements in architecture, performance, and reliability, enabling teams to apply LLMs and advanced agent frameworks at scale.
• Collaborate across hardware, software, and research teams, mentoring and supporting peers while encouraging best engineering practices and a culture of technical excellence.
• Stay informed of the latest advancements in AI infrastructure and contribute to continuous innovation across the organization.

Updated 2025-09-16
Experienced hire

• Architect, develop, and maintain Python-based tools and services to efficiently run a performance-focused multi-tenant Linux cluster including embedded, desktop, and server systems
• Work with industry-standard tools (Kubernetes, Slurm, Ansible, GitLab, Artifactory, Jira)
• Actively support users doing development, functional testing, and performance testing on current and pre-production GPU cluster systems
• Work with various teams at NVIDIA across different time zones to incorporate and influence the latest tools for operating GPU clusters
• Collaborate with users and system administrators to seek out ways to improve UX and operational efficiency
• Become an expert on the entire AI infrastructure stack

Updated 2025-10-17
Experienced hire

• Designing and developing software for testing and analysis of our codebases
• Building scalable automation for build, test, integration, and release processes for publicly distributed deep learning libraries
• Developing throughout the software stack, from the user experience and user interfaces down to the cluster and database layers
• Configuring, maintaining, and building upon deployments of industry-standard tools (e.g. Kubernetes, Jenkins, Docker, CMake, GitLab, Jira, etc.)
• Developing front-end solutions using HTML, CSS, JavaScript, and related web technologies
• Advancing the state of the art in those industry-standard tools

Updated 2025-06-27
Experienced hire

• Designing and developing software for testing and analysis of our codebases
• Building scalable automation for build, test, integration, and release processes for publicly distributed deep learning libraries
• Developing throughout the software stack, from the user experience down to the cluster and database layers
• Configuring, maintaining, and building upon deployments of industry-standard tools (e.g. Kubernetes, Jenkins, Docker, CMake, GitHub, GitLab, Jira, etc.)
• Advancing the state of the art in those industry-standard tools

Updated 2025-05-22