AMD AI Software Product Engineer (GPU)
Requirements
Success in this role will require deep knowledge of Data Center, Client, and Endpoint AI workloads such as LLM, Generative AI, Recommendation, and/or transformer … AI across cloud, client, edge… The candidate needs hands-on experience with various AI models, end-to-end pipelines, industry frameworks (PyTorch, vLLM, SGLang, llm-d, Triton) / SDKs, and solutions. KEY RESPONSIBILITIES: Position technical proposals / enablement to (blogs, tutorials, …
Responsibilities
THE ROLE: The “AI Product Applications Engineer (Solution Architect) – China” position is in the AMD AI group, located in China.
We are looking for a Software Test Development Engineer in NVIDIA’s AI SWQA team. The position is in the NVIDIA AI Software Quality Assurance team, which defines, develops, and performs tests to validate the robustness and measure the performance of NVIDIA’s AI software and GPU infrastructure for autonomous driving, healthcare, speech recognition, natural language processing, and a wide variety of other AI scenarios. This team collaborates with multiple AI product teams to develop new products, derive and improve complex test plans, and improve workflow processes for a diverse range of GPU computing platforms. You should thrive in the critical path, supporting developers who work for billion-dollar business lines, and intimately understand the values of responsiveness, thoroughness, and teamwork. You should constantly foster and implement efficiency improvements across your domain. Join the team building software that will be used by the entire world!
What you’ll be doing:
• Work closely with global cross-functional teams to understand the test requirements and take ownership of product quality.
• Plan, design, execute, and automate test plans and test cases, and produce test reports.
• Manage the bug lifecycle and work with other groups to drive solutions.
• Automate test cases and assist in architecting, crafting, and implementing test frameworks.
• Reproduce customer issues in-house and verify fixes.
• Design, build, and harden containers for NIM runtimes and inference backends; enable reproducible, multi-arch, CUDA-optimized builds.
• Develop Python tooling and services for build orchestration, CI/CD integrations, Helm/Operator automation, and test harnesses; enforce quality with typing, linting, and unit/integration tests.
• Help design and evolve Kubernetes deployment patterns for NIMs, including GPU scheduling, autoscaling, and multi-cluster rollouts.
• Optimize container performance: layer layout, startup time, build caching, runtime memory/IO, network, and GPU utilization; instrument with metrics and tracing.
• Evolve the base image strategy, dependency management, and artifact/registry topology.
• Collaborate across research, backend, SRE, and product teams to ensure day-0 availability of new models.
• Mentor teammates; set high engineering standards for container quality, security, and operability.
We are now looking for a Deep Learning Software QA Engineer Intern! The position is in the NVIDIA Deep Learning Software Quality Assurance team, which defines, develops, and performs tests to validate the robustness and measure the performance of NVIDIA’s Deep Learning software and GPU infrastructure for autonomous driving, healthcare, speech recognition, natural language processing, and a wide variety of other AI scenarios. This team collaborates with multiple AI product teams to develop new products, derive and improve complex test plans, and improve workflow processes for a diverse range of GPU computing platforms. You should thrive in the critical path, supporting developers who work for billion-dollar business lines, and intimately understand the values of responsiveness, thoroughness, and teamwork. You should constantly foster and implement efficiency improvements across your domain. Join the team building software that will be used by the entire world!
What you’ll be doing:
• Be responsible for functionality, compatibility, and performance tests in NVIDIA AI software stack releases.
• Develop, maintain, and improve test automation infrastructure using AI tools.
• Work with development teams to triage issues, perform root-cause analysis, verify fixes, define new tests, and improve test plans.