
NVIDIA: Senior DevOps Engineer - AI and AV Infrastructure

Experienced hire, full-time
Location: Shanghai | Beijing | Shenzhen
Status: Hiring

Job Requirements


NVIDIA has become the platform upon which every new AI-powered application is built. From healthcare research applications to autonomous vehicles and voice-recognition systems, the need for advanced perception and cognitive capabilities is exploding, and NVIDIA is right at the center of this revolution.

We are seeking a motivated Senior DevOps Engineer to join our Autonomous Vehicle Infrastructure organization, focusing on building, deploying, and operating validation platforms at scale. In this role, you will work with internal teams and external partners to integrate distributed systems, manage large-scale data pipelines, and operationalize next-generation validation workflows for autonomous driving.

This role offers a chance to start from the ground up: standing up new vendor-provided platforms, validating integration paths, and ensuring infrastructure is reliable, secure, and production-ready. You will combine hands-on engineering, infrastructure deployment, and workflow automation to help scale our AV validation ecosystem.
What You’ll Be Doing:
• Deploy and operationalize vendor-provided platforms in our service cloud, starting with proof-of-concept environments to validate dependencies, workflows, and performance.
• Build and maintain distributed infrastructure that supports large-scale log ingestion, data processing, and scenario validation.
• Automate workflows and pipelines using Python, Bash, and Bazel to ensure reproducibility, efficiency, and reliable distributed execution.
• Integrate simulation and drive logs (e.g., parquet, world model data) with validation platforms, ensuring seamless end-to-end coverage analysis.
• Provide visualization and reporting capabilities to surface validation results, coverage metrics, and ac…

Job Responsibilities


N/A
Includes English-language materials
DevOps
Python
Bash
Parquet
Security
Linux
Kubernetes
Related Positions

Apple
Experienced hire: Machine

This role requires a blend of skills in software engineering, machine learning, and operations to ensure the smooth functioning of ML systems in production environments. In this role you will:
• Lead the team to design and implement automation for model training, testing, validation, and deployment
• Collaborate with machine learning engineers to ensure efficient deployment and scaling of ML models
• Implement monitoring and alerting systems to track model performance, system health, and data drift
• Optimize compute resources for cost and performance efficiency
• Manage model versions to ensure traceability and reproducibility

Updated 2025-07-22 | Shanghai
Microsoft
Experienced hire: Program

• Lead hands-on design and development efforts primarily using Python, building robust, scalable, and customer-focused AI/ML solutions.
• Engage directly with key enterprise customers to strategize, architect, and implement AI-driven, agentic AI solutions leveraging Azure AI services including Azure OpenAI and Azure ML.
• Translate complex requirements into practical, well-architected technical solutions.
• Develop end-to-end rapid prototypes involving data ingestion, validation, processing, and model deployment using Azure platform components.
• Build, customize, and optimize AI models and related components for customer-specific use cases.
• Integrate AI solutions with full-stack architectures, preferably leveraging experience with JavaScript frameworks (e.g., Node.js, React) and/or .NET ecosystems.
• Establish and maintain robust CI/CD and MLOps pipelines, leveraging Azure DevOps and GitHub for automated deployments.
• Proactively explore diverse datasets to engineer novel features and signals that significantly enhance ML performance.
• Participate actively in every phase of the model lifecycle, from conceptualization, training, fine-tuning, validation, and deployment to continuous monitoring and improvement.

Updated 2025-10-07 | Shanghai | Beijing
NVIDIA
Experienced hire

• Providing Ethernet and routing expertise to customers during project delivery to design, architect and test Ethernet networking solutions.
• Work on multi-functional teams to provide Ethernet network expertise to server infrastructure builds, accelerated computing workloads and GPU enabled AI applications.
• Crafting and evaluating DevOps automation scripts for network operations, crafting network architectures, and developing switch fabric configurations.
• Implementing tasks related to network configuration and validation for data centers.
• Create Methods of Procedure and deployment documents.
• Use software tools to validate and monitor network performance.

Updated 2025-09-18 | Beijing | Shanghai | Shenzhen
NVIDIA
Experienced hire

• Providing Ethernet and routing expertise to customers during project delivery to design, architect and test Ethernet networking solutions.
• Work on multi-functional teams to provide Ethernet network expertise to server infrastructure builds, accelerated computing workloads and GPU enabled AI applications.
• Crafting and evaluating DevOps automation scripts for network operations, crafting network architectures, and developing switch fabric configurations.
• Implementing tasks related to network configuration and validation for data centers.
• Create Methods of Procedure and deployment documents.
• Use software tools to validate and monitor network performance.

Updated 2025-10-22 | Beijing | Shanghai | Shenzhen