
NVIDIA GPU Driver Profiler Engineer

Experienced hire · Full-time · Location: Shanghai · Status: Hiring

Qualifications


• B.S. in EE/CS (or equivalent experience) with 2+ years of experience, M.S. with 1+ years of experience, or Ph.D.
• Strong programming ability in C, C++, and scripting languages.
• Quick learner, willing to dive in where needed and debug complex code and UMD/KMD (user-mode/kernel-mode driver) interactions.
• Driver experience (preferably kernel driver).




Ways to stand out from the crowd:

• CPU or GPU hardware architecture knowledge
• Familiarity with power, performance, and clock control within the kernel
• Knowledge of a GPU API such as CUDA, OpenCL, OpenGL, OpenGL ES, DirectX, or a console graphics API
• Good understanding of embedded environments such as embedded Linux or a real-time OS

Responsibilities


• Revise, update, and test kernel interfaces, and review code used by the Developer Tools team
• Collect requirements from developer-tools features and work with the kernel team to co-design new interfaces
• Implement new features as well as HAL support for new GPU architectures
• Support various OSes and driver architectures: Windows WDDM, desktop Linux, mobile Linux, and QNX
• Contribute to next-generation architectures (both SW and HW)
Related positions

AMD
Experienced hire · Enginee

THE ROLE: Triton is a language and compiler for writing highly efficient custom deep learning primitives. It's widely adopted in open AI software stack projects like PyTorch, vLLM, SGLang, and many others. AMD GPUs are an official backend in Triton and we are fully committed to it. If you are interested in making GPUs run fast by developing the Triton compiler and kernels, please come join us!

Updated 2025-10-06
NVIDIA
Experienced hire

N/A

Updated 2025-08-29
NVIDIA
Intern

We are now looking for a Performance Engineer Intern to support our growing investments in performance testing of various company datacenter products and applications. Today, NVIDIA is tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPUs act as the brains of computers, robots, and self-driving cars that can understand the world, all while striving to deliver the highest possible performance of our products. You will be part of the global Performance Lab team, improving our capacity to expertly and accurately benchmark state-of-the-art datacenter applications and products. We also develop new scripts that enhance the team's ability to gather data through automation, and design efficient processes for testing a wide variety of applications and hardware. The data we collect drives marketing/sales collateral as well as engineering studies for current and future products. You will have the opportunity to work with multi-functional teams in a dynamic environment where multiple projects are active at once and priorities may shift frequently.

What you'll be doing:

• Benchmark, profile, and analyze the performance of AI workloads specifically tailored for large-scale LLM training and inference, as well as High-Performance Computing (HPC), on NVIDIA supercomputers and distributed systems
• Aggregate the testing data and produce written and visual reports for internal sales, marketing, SW, and HW teams
• Set up and configure systems with the appropriate hardware and software to run benchmarks
• Collaborate with internal teams to debug and resolve performance issues
• Develop Python scripts to automate the testing of various applications
• Assist with the development of tools and processes that improve our ability to perform automated testing

Updated 2025-10-13
Amazon
Experienced hire · Solution

*Hiring locations: Beijing, Shanghai, Guangzhou, Shenzhen, Hong Kong (visa sponsorship provided)

Would you like to join one of the fastest-growing teams within Amazon Web Services (AWS) and help shape the future of GPU optimization and high-performance computing? Join us in helping customers across all industries maximize the performance and efficiency of their GPU workloads on AWS while pioneering innovative optimization solutions. As a Senior Technical Account Manager (Sr. TAM) specializing in GPU Optimization in AWS Enterprise Support, you will play a crucial role in two key missions: guiding customers' GPU acceleration initiatives across AWS's comprehensive compute portfolio, and spearheading the development of optimization strategies that revolutionize customer workload performance.

Key Job Responsibilities

- Build and maintain long-term technical relationships with enterprise customers, focusing on GPU performance optimization and resource-allocation efficiency on AWS cloud or similar cloud services.
- Analyze customers' current architecture, models, data pipelines, and deployment patterns; create a GPU bottleneck map and measurable KPIs (e.g., GPU utilization, throughput, P95/P99 latency, cost per unit).
- Design and optimize GPU resource usage on EC2/EKS/SageMaker or equivalent cloud compute, container, and ML services; implement node-pool tiering, Karpenter/Cluster Autoscaler tuning, auto scaling, and cost governance (Savings Plans/RI/Spot/ODCR or equivalent).
- Drive GPU partitioning and multi-tenant resource-sharing strategies to reduce idle resources and increase overall cluster utilization.
- Guide customers in PyTorch/TensorFlow performance tuning (DataLoader optimization, mixed precision, gradient accumulation, operator fusion, torch.compile) and inference acceleration (ONNX, TensorRT, CUDA Graphs, model compression).
- Build GPU observability and monitoring systems (nvidia-smi, CloudWatch or equivalent monitoring tools, profilers, distributed-communication metrics) to align capacity planning with SLOs.
- Ensure compatibility across GPU drivers, CUDA, container runtimes, and frameworks; standardize change-management and rollback processes.
- Collaborate with cloud provider internal teams and external partners (NVIDIA, ISVs) to resolve complex cross-domain issues and deliver repeatable optimization solutions.

Updated 2025-08-18