NVIDIA GPU C++ Modeling Engineer - New College Grad 2026
Requirements
• In-depth knowledge of computer architecture, with a good understanding of modern ISAs and microprocessor implementation techniques.
• Good understanding of GPU concepts and pipelines, for both graphics processing and parallel compute.
• Strong command of the C++ language.
• Experience with performance/functional modeling, profiling, and analysis is a plus.
• Experience developing trace-driven and execution-driven simulation models is a plus.
• BSEE, BSCSE, or equivalent required. MS or PhD is a plus.
Ways to stand out from the crowd:
• Candidates with experience in GPU shader unit design, microprocessor design, or CPU/GPU performance analysis are preferred.
Responsibilities
We are now looking for a GPU C++ Modeling Engineer - Performance/Functional Modeling, Validation and Analysis of Shader. The TPC arch team is a fast-growing team that welcomes engineers of all levels. Our aim is to explore and design better GPU architectures that help AI programs run efficiently and make game rendering faster and more realistic. The TPC is the core of the GPU. It includes several units for scheduling, computation, and caching. You will work closely with the US team on test writing, functional and performance implementation of a range of features, and the study of new features. Don’t worry about the importance of the work. Don’t worry about a heavy workload. Join us and grow faster.
What you’ll be doing:
• Investigate and propose architecture ideas based on quantitative study of existing and projected GPU architectures.
• Develop performance and functional simulation models.
• Develop performance and functional test plans and tests to validate new GPU architecture and features.
• Test and debug on simulators, RTL, and real silicon.
At NVIDIA, we pride ourselves on having energy-efficient products. We believe that continuing to maintain our products' energy efficiency compared to the competition is key to our continued success. Our team is responsible for researching, developing, and deploying methodologies to help NVIDIA's products become more energy efficient, and for building energy models that integrate into architectural simulators, RTL simulation, and emulation platforms. Key responsibilities include developing techniques to model, analyze, and reduce the power consumption of NVIDIA GPUs. As a member of the Power Modeling, Methodology, and Analysis Team, you will collaborate with Architects, Performance Engineers, Software Engineers, ASIC Design Engineers, and Physical Design teams to study and implement energy modeling techniques for NVIDIA's next-generation GPUs and Tegra SOCs. Your contributions will help us gain early insight into the energy consumption of graphics and artificial intelligence workloads, and will allow us to influence architectural, design, and power management improvements.
What you’ll be doing:
• Work with architects and performance architects to develop an energy-efficient GPU.
• Develop methodologies and workflows to select and run a wide variety of workloads to train models using ML and/or statistical techniques.
• Develop methodologies to improve the accuracy of energy models under various constraints, such as process, timing, floorplan, and layout.
• Correlate the predicted energy from models created at different stages of the design cycle, with the goal of bridging early estimates to silicon.
• Develop tools to debug energy inefficiencies observed in various workloads run on silicon, RTL, and architectural simulators. Work with architects to fix the identified energy inefficiencies.
• Work with performance, verification, and emulation methodology and infrastructure development teams to integrate energy models into their platforms.
• Prototype new architectural features, create an energy model, and analyze the system impact.
We are now looking for a GPU C++ Modeling Engineer intern - Performance/Functional Modeling, Validation and Analysis of Shader. The TPC arch team is a fast-growing team that welcomes engineers of all levels. Our aim is to explore and design better GPU architectures that help AI programs run efficiently and make game rendering faster and more realistic. The TPC is the core of the GPU. It includes several units for scheduling, computation, and caching. You will work closely with the US team on test writing, functional and performance implementation of a range of features, and the study of new features.
What you’ll be doing:
• Investigate and propose architecture ideas based on quantitative study of existing and projected SM architectures.
• Develop performance and functional simulation models.
• Develop performance and functional test plans and tests to validate new SM architecture and features.
• Test and debug on simulators, RTL, and real silicon.
THE ROLE: This role sits in the post-silicon power and performance engineering team and acts as the end-to-end owner of System/APU/GPU performance optimization from first silicon to production. It calls for a strong data science sense: the ability to learn and build a power and performance data insight structure by making good use of internal experiment data, covering the whole data analytics process from data collection, cleaning, modeling, and templating to insights and visualization. Technically, it requires a clear understanding of the key factors that influence performance in silicon design and implementation, an understanding of how to align with critical program milestones, and responsibility for driving performance-related feature readiness and data analytics across sites and teams.
THE PERSON: The person needs to be passionate and self-motivated, sensitive to delivery urgency and the path of innovation, a good communicator with team spirit, eager to learn, and able to resolve complex problems. The person needs to have a natural interest in mathematical algorithms, logical coding, and the beauty of telling stories with data. The person needs to be enthusiastic about data analytics work.
KEY RESPONSIBILITIES: Work closely with the internal data-producing team and the external automation framework team to develop a power and performance data analytics process and visualization structure grounded in engineering experience. Work closely with the SOC design team and the program lead team to understand performance targets and validation methodology for APU/GPU components and the overall reference design system. Develop system-level and component-level performance test strategies and plans. Attend ASIC bring-up and validation, and ensure coverage and schedule. Execute performance test plans for APU/GPU/System and suggest improvements. Compose validation reports and propose improvements to future test plans.
Work with the CE team to understand customer expectations for power and performance design, and support debugging. Automate some of the manual tests in Shell, Python, or another scripting language.
We are now looking for a Deep Learning Performance Software Engineer! We are expanding our research and development for Inference. We seek excellent Software Engineers and Senior Software Engineers to join our team. We specialize in developing GPU-accelerated deep learning software. Researchers around the world are using NVIDIA GPUs to power a revolution in deep learning, enabling breakthroughs in numerous areas. Join the team that builds software to enable new solutions. Collaborate with the deep learning community to implement the latest algorithms for public release in TensorRT. The ability to work in a fast-paced, customer-oriented team is required, and excellent communication skills are necessary.
What you’ll be doing:
• Develop highly optimized deep learning kernels for inference.
• Do performance optimization, analysis, and tuning.
• Work with cross-collaborative teams across automotive, image understanding, and speech understanding to develop innovative solutions.
• Occasionally travel to conferences and customers for technical consultation and training.