NVIDIA GPU Power Analysis Intern - 2026
Requirements
• Pursuing an MS or PhD in a related field.
• Basic understanding of energy consumption, power estimation, and low-power design concepts.
• Familiarity with Verilog and ASIC design principles, including knowledge of logic cells.
• Good verbal/written English and interpersonal skills; close collaboration with design teams is expected.
• Strong coding skills, preferably in Python and C++.
• Ability to formulate and analyze algorithms, and comment on their tim…
Responsibilities
• Use internally developed tools and industry-standard pre-silicon gate-level and RTL power analysis tools to help improve product power efficiency.
• Develop and share best practices for performing pre-silicon power analysis; enhance internal power tools and automate those best practices.
• Perform comparative power analysis to spot trends and anomalies that warrant closer scrutiny.
• Interact with architects and RTL designers to help them interpret their power data and identify power bugs; drive them to implement fixes.
• Select and run a wide variety of workloads for power analysis; collaborate with performance and architecture teams to validate the performance of those workloads.
• Prototype new architectural features in Verilog and analyze their power.
• Work proactively on GPU/SoC feature/IP bring-up and characterization, including creating validation, tuning, and optimization methodologies and developing proper test plans and test cases.
• Perform system-level use-case analysis/profiling and feature return-on-investment investigations; prototype, validate, and design methods and control policies to bring new technology into production and exceed product goals.
• Collaborate across System Architecture, DFT, ASIC, SW/FW, platform, validation, and production teams throughout the product life cycle on system-level architecture, design, productization, debugging, and deployment for complex silicon designs to improve quality, safety, and manufacturability.
• Design tools/scripts to automate product definitions, data collection, test-case execution, and results analysis; provide detailed data analysis of functionality, performance, and latency.
• Take a hands-on role in silicon bring-up, validation, and debug; coordinate product-level feature deployment to achieve high product quality on an aggressive schedule.
• Build internal profiling/analysis tools for real-world application perf/power analysis at the system level, from small to large scale.
• Build infrastructure or services for data visualization/mining and management.
• Work with our users to build their perf/power models on top of our tools for next-generation HW design.
• Design tools to automate validation of different chip features, data collection, test-case execution, and results analysis.
• Collaborate closely with the silicon solution team, feature design team, and software team to gather and discuss tool requirements so that qualification coverage catches problems early.
• Work closely with the silicon solution team, board team, and feature design team on firmware function development and validation.
NVIDIA is developing processor and system architectures that accelerate deep learning and high-performance computing applications. We are looking for a deep learning system performance architect intern to join our AI performance modelling, analysis, and optimization efforts. In this position, you will work on DL performance modelling, analysis, and optimization on state-of-the-art hardware architectures for various LLM workloads, and make your contributions to our dynamic, technology-focused company.
What you'll be doing:
• Analyze state-of-the-art DL networks (LLMs etc.), and identify and prototype performance opportunities to influence the SW and Architecture teams for NVIDIA's current and next-gen inference products.
• Develop analytical models of state-of-the-art deep learning networks and algorithms to innovate processor and system architecture design for performance and efficiency.
• Specify hardware/software configurations and metrics to analyze performance, power, and accuracy in existing and future uni-processor and multiprocessor configurations.
• Collaborate across the company to guide the direction of next-gen deep learning HW/SW by working with architecture, software, and product teams.