
NVIDIA Senior CUDA Test Development Software Engineer

Experienced hire | Full-time | Location: Shenzhen | Status: Hiring

Requirements


• MS or PhD degree from a leading university in computer science or a related field.
• At least 3 years of relevant professional experience.
• Excellent QA sense, knowledge, and experience in software testing.
• Rich experience in test case development, test automation, and failure analysis.
• Proficient programming and debugging skills in C/C++ and Python.
• Comprehensive knowledge of Linux and Windows operating systems.
• Experience in using AI development tools for test plan creation, test case development, and test case automation.

Ways to stand out from the crowd: 
• Excellent English communication and collaboration skills.
• Strong understanding of CUDA, HPC, Gcov, VectorCAST, and Coverity.

Responsibilities


• Design and implement functional and performance tests for CUDA products, such as drivers and libraries.
• Automate CUDA tests, design test plans, and integrate them into the automated testing infrastructure.
• Triage test results, root-cause test failures or performance drops, and drive bugs through to a fix.
• Develop scripts and tools, and optimize workflows to improve efficiency and productivity.
Related positions

Experienced hire

NVIDIA is now looking for LLM Train Framework Engineers for the Megatron Core team. Megatron Core is an open-source, scalable, cloud-native framework built for researchers and developers working on Large Language Model (LLM) and Multimodal (MM) foundation model pretraining and post-training. Our GenAI Frameworks provide end-to-end model training, including pretraining, alignment, customization, evaluation, deployment, and tooling to optimize performance and user experience. Build on the Megatron Core framework's capabilities by inventing advanced distributed training algorithms and model optimizations. Collaborate with partners to implement optimized solutions.

What you’ll be doing:
• Build and develop open-source Megatron Core.
• Address large-scale AI training and inference challenges across the entire model lifecycle, including orchestration, data pre-processing, model training and tuning, and model deployment.
• Work at the intersection of AI applications, libraries, frameworks, and the entire software stack.
• Spearhead advancements in model architectures, distributed training strategies, and model-parallel approaches.
• Accelerate foundation model training and optimization through mixed-precision recipes and advanced NVIDIA GPU architectures.
• Tune and optimize the performance of deep learning frameworks and software components.
• Research, prototype, and develop robust and scalable AI tools and pipelines.

Updated 2025-10-13
Experienced hire

• Writing highly tuned compute kernels to perform core deep learning operations (e.g. matrix multiplies, convolutions, normalizations)
• Following general software engineering best practices including support for regression testing and CI/CD flows
• Collaborating with teams across NVIDIA:
  • CUDA compiler team on generating optimal assembly code
  • Deep learning training and inference performance teams on which layers require optimization
  • Hardware and architecture teams on the programming model for new deep learning hardware features

Updated 2025-09-24
Experienced hire

• Design, build, and harden containers for NIM runtimes and inference backends; enable reproducible, multi-arch, CUDA-optimized builds.
• Develop Python tooling and services for build orchestration, CI/CD integrations, Helm/Operator automation, and test harnesses; enforce quality with typing, linting, and unit/integration tests.
• Help design and evolve Kubernetes deployment patterns for NIMs, including GPU scheduling, autoscaling, and multi-cluster rollouts.
• Optimize container performance: layer layout, startup time, build caching, runtime memory/IO, network, and GPU utilization; instrument with metrics and tracing.
• Evolve the base image strategy, dependency management, and artifact/registry topology.
• Collaborate across research, backend, SRE, and product teams to ensure day-0 availability of new models.
• Mentor teammates; set high engineering standards for container quality, security, and operability.

Updated 2025-09-15
Experienced hire

We are now looking for a Senior Solutions Architect, an outstanding engineer able to engage with developers, researchers, and decision makers. We need individuals who can use AI to improve system efficiency and develop close relationships with our industry customers, making NVIDIA a great part of end-user solutions. NVIDIA is the world leader in GPU-accelerated computing and is looking for Solutions Architects to engage our customers. The Senior Solutions Architect will work closely with industry customers in our China region: establishing relationships, solving problems with their engineering teams, and helping them build a successful NVIDIA practice. If interested, do not hesitate to apply online; we are excited to talk with you!

What you’ll be doing:
• Present NVIDIA’s full-stack artificial intelligence solutions and end-to-end platform technology to customers and partners, with in-depth, hands-on engagement with customers or NVIDIA partners on complex data center projects.
• Assist field business development in guiding the customer through the sales process for NVIDIA solutions.
• Understand and analyze customer requirements, and support the solution design and development of applications.
• Work across teams in the company to guide the direction of accelerated computing, collaborating with software, research, and product teams.
• Document the learnings to guide others. This can range from creating targeted training for customers and other Solutions Architects, giving hands-on demos, writing whitepapers, blogs, and wiki articles, and recording short videos, to simply working through hard problems with a partner on a whiteboard.

Updated 2025-07-08