NVIDIA Deep Learning Performance Architect - New College Grad 2026

Experienced hire | Full-time | Location: Shanghai, Beijing | Status: Hiring

Requirements


• Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)
• Agile software development skills are helpful
• Excellent C/C++ programming and software design skills
• Pyth…

Responsibilities


We are now looking for a Deep Learning Performance Architect - New College Grad.
We are expanding our research and development for inference. We seek excellent Software Engineers and Senior Software Engineers to join our team. We specialize in developing GPU-accelerated deep learning software. Researchers around the world are using NVIDIA GPUs to power a revolution in deep learning, enabling breakthroughs in numerous areas. Join the team that builds software to enable new solutions. Collaborate with the deep learning community to implement the latest algorithms for public release in TensorRT. You must be able to work in a fast-paced, customer-oriented team, and excellent communication skills are necessary.
What you’ll be doing:
• Develop highly optimized deep learning kernels for inference
• Do performance optimization, analysis, and tuning
• Work with cross-collaborative teams across automotive, image understanding, and speech understanding to develop innovative solutions
Related positions

Experienced hire

NVIDIA is developing processor and system architectures that accelerate deep learning and high-performance computing applications. We are looking for an expert deep learning system performance architect to join our AI performance modelling, analysis and optimization efforts. In this position, you will have a chance to work on DL performance modelling, analysis, and optimization on state-of-the-art hardware architectures for various LLM workloads. You will make your contributions to our dynamic technology focused company.
What you’ll be doing:
• Analyze state of the art DL networks (LLM etc.), identify and prototype performance opportunities to influence SW and Architecture team for NVIDIA's current and next gen inference products.
• Develop analytical models for the state of the art deep learning networks and algorithm to innovate processor and system architectures design for performance and efficiency.
• Specify hardware/software configurations and metrics to analyze performance, power, and accuracy in existing and future uni-processor and multiprocessor configurations.
• Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.

Updated 2025-11-19 | Shanghai, Beijing
Experienced hire

• Analyze possible use-case scenarios.
• Define the next generation of MMPLEX IP architecture.
• Build algorithm/functional/performance/power models for MMPLEX IP.
• Prototype software/firmware development.
• Perform various kinds of validation and/or verification of hardware builds.

Updated 2025-09-29 | Shanghai
Experienced hire

• Develop production-quality software that ships as part of NVIDIA's AI software stack, including optimized large language model (LLM) support.
• Analyze the performance of important workloads, tuning our current software, and proposing improvements for future software.
• Work with cross-collaborative teams of deep learning software engineers and GPU architects to innovate across applications like generative AI, autonomous driving, computer vision, and recommender systems.
• Adapt to the constantly evolving AI industry by being agile and excited to contribute across the codebase, including API design, software architecture, performance modeling, testing, and GPU kernel development.

Updated 2025-11-03 | Shanghai
Experienced hire

• Analyze brand-new DL networks (LLM etc.), identify and prototype performance opportunities to influence SW and Architecture team for NVIDIA's current and next-gen inference products.
• Develop prototypes of the fastest kernels on present and future NVIDIA GPUs.
• Define hardware and software setups along with measurements to evaluate performance, power consumption, and accuracy in current and upcoming chips.
• Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.

Updated 2025-12-02 | Shanghai