NVIDIA Senior AI Performance and Efficiency Engineer
Requirements
• BS or similar background in Computer Science or a related area (or equivalent experience)
• 8+ years of experience designing and operating large-scale compute infrastructure
• Strong understanding of modern ML techniques and tools
• Experience investigating and resolving training & inference performance issues end to end
• Debugging and optimization experience with Nsight Systems and Nsight Compute (a short profiling sketch follows this list)
• Experience debugging large-scale distributed training using NCCL
• Proficiency in programming & scripting languages such as Python, Go, and Bash; familiarity with cloud computing platforms (e.g., AWS, GCP, Azure); and experience with parallel computing frameworks and paradigms
• Dedication to ongoing learning and staying updated on new technologies and innovative methods in the AI/ML in…
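To make the Nsight and NCCL items above concrete, here is a minimal sketch, assuming PyTorch and a CUDA device, of annotating a toy training step with NVTX ranges so Nsight Systems can attribute GPU time to named phases. The toy model, the step names, and the NCCL_DEBUG setting are illustrative additions, not part of the posting.

```python
# Sketch: annotate a toy PyTorch training step with NVTX ranges so Nsight
# Systems (nsys) can attribute GPU time to named phases.
# Assumes PyTorch and a CUDA device; the model and names are placeholders.
import os
import torch
import torch.nn as nn

# NCCL_DEBUG=INFO is a common first step when debugging distributed training:
# it makes NCCL log communicator setup, topology, and errors (only relevant
# once torch.distributed/NCCL is actually in use).
os.environ.setdefault("NCCL_DEBUG", "INFO")

model = nn.Linear(1024, 1024).cuda()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
x = torch.randn(64, 1024, device="cuda")

for step in range(3):
    torch.cuda.nvtx.range_push(f"step_{step}/forward")
    loss = model(x).square().mean()
    torch.cuda.nvtx.range_pop()

    torch.cuda.nvtx.range_push(f"step_{step}/backward")
    loss.backward()
    opt.step()
    opt.zero_grad()
    torch.cuda.nvtx.range_pop()

# Capture a timeline with, e.g.:  nsys profile -t cuda,nvtx python train_step.py
```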
Responsibilities
• Collaborate closely with our AI/ML researchers to make their ML models more efficient, leading to significant productivity improvements and cost savings
• Build tools and frameworks, and apply ML techniques, to detect and analyze efficiency bottlenecks and deliver productivity improvements for our researchers
• Work with researchers on a variety of innovative ML workloads across robotics, autonomous vehicles, LLMs, video, and more
• Collaborate across engineering organizations to deliver efficiency in our usage of hardware, software, and infrastructure
• Proactively monitor fleet-wide utilization patterns, analyze existing inefficiency patterns or discover new ones, and deliver scalable solutions to address them (a minimal monitoring sketch follows this list)
• Keep up to date with the most recent developments in AI/ML technologies, frameworks, and successful strategies, and advocate for their integration within the organization.
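As a concrete illustration of the fleet-monitoring responsibility above, here is a minimal sketch that samples per-GPU utilization via NVML (pynvml). The low-utilization threshold and the output format are hypothetical; a real agent would export these samples to a fleet-wide metrics system.

```python
# Sketch: sample per-GPU SM/memory utilization with NVML (pynvml), the kind of
# signal a fleet-wide efficiency monitor would aggregate.
import time
import pynvml

pynvml.nvmlInit()
try:
    ngpu = pynvml.nvmlDeviceGetCount()
    for _ in range(3):                          # a real agent runs continuously
        for i in range(ngpu):
            h = pynvml.nvmlDeviceGetHandleByIndex(i)
            util = pynvml.nvmlDeviceGetUtilizationRates(h)  # % over last window
            mem = pynvml.nvmlDeviceGetMemoryInfo(h)         # bytes
            flag = "LOW-UTIL" if util.gpu < 30 else "ok"    # hypothetical threshold
            print(f"gpu{i}: sm={util.gpu}% mem_bw={util.memory}% "
                  f"mem_used={mem.used / 2**30:.1f}GiB [{flag}]")
        time.sleep(1.0)
finally:
    pynvml.nvmlShutdown()
```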
A key part of NVIDIA's strength is our sophisticated analysis and debugging tools, which empower NVIDIA engineers to improve the performance and power efficiency of our products and the applications running on them. We are looking for forward-thinking, hard-working, and creative people to join a multifaceted software team with high standards! This software engineering role involves developing tools for AI researchers and SW/HW teams running AI workloads on GPU clusters. As a member of the software development team, you will work with users from departments such as architecture and software teams. Our work gives users intuitive, rich, and accurate insight into the workload and the system, and empowers them to find opportunities in software and hardware, build high-level models to propose and deliver the best hardware and software to our customers, and debug tricky failures and issues to improve the performance and efficiency of the system.
What you’ll be doing:
• Build internal profiling and analysis tools for AI workloads at large scale
• Build debugging tools for commonly encountered problems such as memory and networking issues
• Create benchmarking and simulation technologies for AI systems and GPU clusters (see the benchmarking sketch after this list)
• Partner with HW architects to propose new features or improve existing ones based on real-world use cases
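A minimal sketch of the kind of benchmarking building block referenced above, assuming PyTorch on a CUDA device: timing a GPU operation with CUDA events. The matmul workload, sizes, and iteration counts are placeholders, not anything prescribed by the posting.

```python
# Sketch: micro-benchmark a GPU op with CUDA events and report throughput.
import torch

def time_gpu_op(fn, warmup=10, iters=50):
    """Return mean milliseconds per call of fn(), measured with CUDA events."""
    for _ in range(warmup):
        fn()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters      # elapsed_time() is in milliseconds

a = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
b = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
ms = time_gpu_op(lambda: a @ b)
tflops = 2 * 4096**3 / (ms * 1e-3) / 1e12       # 2*N^3 FLOPs for an N x N matmul
print(f"{ms:.3f} ms/iter, ~{tflops:.1f} TFLOP/s")
```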

• Understand business scenarios and design targeted data acquisition solutions, ensuring data is relevant, high-quality, and aligned with project goals.
• Architect, design, and maintain enterprise-grade databases, data warehouses, and lakehouse systems to support analytical, operational, and AI workloads.
• Model and optimize schema design, storage layouts, data partitioning, clustering, and indexing strategies for large-scale datasets.
• Implement and maintain ETL/ELT pipelines feeding data warehouses (e.g., Snowflake, BigQuery, Redshift, Databricks, or open-lakehouse environments).
• Design, collect, and maintain high-quality datasets for AI inference and LLM optimization, fine-tuning, and testing, ensuring data is formatted and preprocessed to meet model requirements.
• Collaborate with AI application engineers to understand model performance requirements and translate them into targeted data collection and preparation strategies.
• Develop and implement automated data pipelines for efficient data processing, including data cleaning, labeling, augmentation, and transformation (a minimal pipeline sketch follows this list).
• Proactively identify data gaps based on model performance metrics, and design solutions to acquire, clean, and optimize data for improved model accuracy and efficiency.
• Build, clean, and manage diverse data sources, ensuring compliance with data security and privacy standards.
• Conduct exploratory data analysis to discover data patterns, anomalies, and optimization opportunities that directly impact model performance.
• Continuously learn and adapt to the latest advancements in data engineering, AI, and large language model (LLM) technologies.
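To illustrate the automated-pipeline bullet above, here is a minimal sketch, assuming pandas and pyarrow, of one cleaning-and-partitioning step: normalizing records, dropping exact duplicates, and writing a Parquet dataset partitioned by a hypothetical `source` column. The column names and output path are illustrative, not taken from the posting.

```python
# Sketch: tiny cleaning/partitioning step of the kind such pipelines chain
# together before fine-tuning or evaluation jobs consume the data.
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

raw = pd.DataFrame({
    "source": ["web", "web", "docs"],
    "prompt": ["  What is NVLink? ", "What is NVLink?", "Explain ETL vs ELT"],
    "response": ["NVLink is ...", "NVLink is ...", "ETL transforms before load ..."],
})

# Basic cleaning: trim whitespace, drop empty rows, drop exact duplicate pairs.
clean = raw.assign(
    prompt=raw["prompt"].str.strip(),
    response=raw["response"].str.strip(),
)
clean = clean[(clean["prompt"] != "") & (clean["response"] != "")]
clean = clean.drop_duplicates(subset=["prompt", "response"])

# Write a partitioned Parquet dataset so downstream jobs can prune by partition
# instead of scanning everything.
pq.write_to_dataset(
    pa.Table.from_pandas(clean, preserve_index=False),
    root_path="finetune_dataset",      # hypothetical output path
    partition_cols=["source"],
)
```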
THE ROLE: AMD drives innovation at the intersection of performance and efficiency to shape the future of AI, cloud computing, and high-performance servers. We seek an experienced IC Connector Engineer to lead the design and development of high-speed connectors, cables, and sockets for advanced servers and AI platforms. This role requires deep technical expertise and program leadership to deliver reliable, cost-effective solutions at scale.
As chip sizes continue to grow, power efficiency has become paramount across all applications, from data centers to automotive and personal computing. Our PMU IP, developed over the past 13 years, is crucial to optimizing chip performance and efficiency in both idle and active scenarios. The PMU IP consists of a RISC-V core and custom-designed control logic. It collects and processes data from the entire chip, working in tandem with software running on the RISC-V core to determine optimal operating points (a conceptual sketch of this control loop follows below). We are seeking a Senior ASIC Engineer who can help architect the next-generation PMU for AI data centers.
What you’ll be doing:
• Collaborate with the production SW team and power architecture team to define the architecture/micro-architecture for various power features.
• Learn how the PMU's function impacts the system and support silicon debug.
• Implement the micro-architecture in RTL.
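Purely as a conceptual illustration of the telemetry-to-operating-point loop the posting describes (real PMU firmware would be C or assembly driving custom hardware, not Python), here is a hedged sketch. The telemetry fields, operating-point table, and thresholds are all hypothetical.

```python
# Conceptual sketch only: firmware on the PMU's RISC-V core reads chip
# telemetry and picks a voltage/frequency operating point within budgets.
from dataclasses import dataclass

@dataclass
class Telemetry:
    temp_c: float          # hottest sensor reading
    power_w: float         # package power estimate
    utilization: float     # 0.0 .. 1.0 activity from performance counters

# Hypothetical operating points, lowest to highest: (voltage V, frequency Hz).
OPERATING_POINTS = [(0.70, 1.2e9), (0.80, 1.6e9), (0.90, 2.0e9)]
POWER_LIMIT_W = 350.0      # hypothetical package power budget
TEMP_LIMIT_C = 90.0        # hypothetical thermal limit

def select_operating_point(t: Telemetry) -> tuple[float, float]:
    """Pick the highest point that stays within thermal and power budgets."""
    if t.temp_c > TEMP_LIMIT_C or t.power_w > POWER_LIMIT_W:
        return OPERATING_POINTS[0]            # over budget: throttle hard
    if t.utilization < 0.2:
        return OPERATING_POINTS[0]            # idle: save power
    if t.utilization < 0.7:
        return OPERATING_POINTS[1]
    return OPERATING_POINTS[-1]               # busy and within budget

print(select_operating_point(Telemetry(temp_c=72.0, power_w=280.0, utilization=0.85)))
```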