NVIDIA Senior Solutions Architect - CRISP System
Requirements
• BS/MS/PhD in Engineering, Mathematics, Physics, or Computer Science, or equivalent experience
• 5+ years of work-related experience in deep learning/AI, AI system design, DevOps, cloud software development or high-performance computing.
• Familiar with GPU/AI accelerators and ecosystems.
• Experience in Linux …
Responsibilities
• Work with sales to introduce NVIDIA technologies and products.
• Act as account owner: promote products to customers and bring feedback to the product team.
• Run private or public workshops to present NVIDIA’s offerings in detail.
• Debug, tune, and test during qualification, POC, integration, and pilot phases.
• Build good relationships with customers at all levels and become a trusted advisor.
• Discover opportunities and guide customers to suitable solutions.
• Share knowledge across teams.
• Primary responsibilities will include building AI/HPC infrastructure for new and existing customers.
• Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, real-time monitoring, logging, and alerting.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Provide feedback to internal teams, for example by opening bugs, documenting workarounds, and suggesting improvements.
• Design, implement, and optimize scalable ML pipelines for training multimodal foundation models for robotics.
• Collaborate with researchers to integrate cutting-edge model architectures into scalable training pipelines.
• Implement scalable data loaders and preprocessors for multimodal datasets, such as videos, text, and sensor data.
• Optimize GPU and cluster utilization for efficient model training and fine-tuning on massive datasets.
• Develop robust monitoring and debugging tools to ensure the reliability and performance of training workflows on large GPU clusters.
• Develop and maintain simulation environments built on frameworks like MuJoCo and Isaac Lab to support robotics research.
• Implement and test control algorithms and XR teleoperation interfaces for simulated robots.
• Build procedural generation pipelines for diverse environments, object layouts, and robot motions.
• Optimize GPU-based physics simulator performance for large-scale training workloads.
• Import, configure, and validate robot assets in USD format, ensuring successful Sim2Real transfer.
• Implement Sim2Real pipelines and deploy learned models to physical robots.
NVIDIA networking designs and manufactures high-performance networking equipment that enables the most powerful supercomputers in the largest data centers in the world. With a distributed collection of NVIDIA GPUs interconnected by networking solutions such as InfiniBand, Ethernet, or RoCE (RDMA over Converged Ethernet), we make powerful ML/AI platforms possible. We are seeking motivated, personable, and independent individuals to join our team! We seek experienced embedded software engineers to help support our groundbreaking, innovative technologies that make AI workloads in large clusters even more performant. As a networking Sr. Solutions Architect at NVIDIA, you will have agency and a palpable effect on the business, and work closely with customers and R&D teams.
What you’ll be doing:
• Support networking technologies such as Spectrum-X and work with customers on their technical challenges and requirements using these technologies during pre-sales activities
• Develop proof-of-concept materials for innovative technologies for use by early adopters
• Gain customers’ trust and understand their needs to help design and deploy groundbreaking NVIDIA networking platforms to run AI and HPC workloads
• Address sophisticated and highly visible customer issues
• Work closely with R&D teams to develop new features for customers
• Help with product requirements alongside engineering and product marketing