NVIDIA Deep Learning Performance Architect

Experienced hire · Full-time · 2025-09-03 · Location: Shanghai | Beijing · Status: Hiring

Requirements

• BS, MS or PhD in relevant discipline (CS, EE, Math, etc.) or equivalent experience.
• 5+ years work experience.
• Experience with popular AI models (e.g., LLM and AIGC models)
• Familiarity with common deep learning software frameworks (e.g., Torch/JAX/TensorFlow/TensorRT)
• Knowledge of and experience with hardware architectures for deep learning applications

With competitive salaries and a generous benefits package, we are widely considered to be one of the technology world’s most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to unprecedented growt…

Responsibilities

• Analyze state-of-the-art DL networks (LLMs, etc.), and identify and prototype performance opportunities to influence the SW and Architecture teams for NVIDIA's current and next-gen inference products.
• Develop analytical models for state-of-the-art deep learning networks and algorithms to drive innovation in processor and system architecture design for performance and efficiency.
• Specify hardware/software configurations and metrics to analyze performance, power, and accuracy in existing and future uni-processor and multiprocessor configurations.
• Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.

Related positions

Deep Learning Performance Architect

Experienced hire

We are now looking for a Deep Learning Performance Software Engineer! We are expanding our research and development for Inference. We seek excellent Software Engineers and Senior Software Engineers to join our team. We specialize in developing GPU-accelerated deep learning software. Researchers around the world are using NVIDIA GPUs to power a revolution in deep learning, enabling breakthroughs in numerous areas. Join the team that builds software to enable new solutions. Collaborate with the deep learning community to implement the latest algorithms for public release in TensorRT. Your ability to work in a fast-paced customer-oriented team is required and excellent communication skills are necessary.

What you’ll be doing:
• Develop highly optimized deep learning kernels for inference
• Do performance optimization, analysis, and tuning
• Work with cross-collaborative teams across automotive, image understanding, and speech understanding to develop innovative solutions
• Occasionally travel to conferences and customers for technical consultation and training

Updated 2025-09-23 · Shanghai | Beijing

Deep Learning Performance Architect - Perf Tools

Experienced hire

• Architect Performance Tooling: Develop infrastructure tools/libraries for GPU performance analysis, visualization, and automated workflows used across the GPU SW/HW development life cycle.
• Unlock Architectural Insights: Analyze GPU workloads to identify bottlenecks and define new hardware profiling features that enhance perf debug and profiling capabilities.
• AI-Powered Automation: Build AI/ML-driven tools to automate performance analysis, generate perf optimization guidance, and improve the user experience of the profiling infrastructure.
• Cross-Stack Collaboration: Partner with kernel developers, system software teams, and hardware architects to support performance studies, improve the CUDA software stack, and co-design performance-centric solutions for current and next-generation GPU architectures.

Updated 2025-12-24 · Shanghai | Beijing

Deep Learning Performance Architect

Experienced hire

NVIDIA is developing processors and system architectures that accelerate deep learning on edge devices, workstations and data center GPUs for a variety of applications, including automotive, robotics, large language models (LLMs) and generative AI models. We are looking for an expert deep learning system performance architect to join our modelling, efficiency optimization, performance projection and analysis effort. In this position, you will have the chance to optimize deep learning hardware and software architecture and make a significant impact in a dynamic, technology-focused company.

What you’ll be doing:
• Analyze performance and efficiency of various machine learning/deep learning algorithms on different architectures
• Identify architecture and software performance bottlenecks and propose optimizations
• Explore new features and hardware capabilities on deep learning applications

Updated 2025-09-03 · Shanghai

Deep Learning Performance Architect

Experienced hire

• Design and develop the architecture, interface and features of the GPU kernel library
• Keep improving the quality and performance of the library and its GPU kernels
• Explore and expand the boundary of innovative technologies like GPU code generation and fusion
• Contribute to NVIDIA's AI business by collaborating closely with DL product teams as well as kernel development teams

Updated 2025-10-17 · Shanghai | Beijing