NVIDIA Senior Software Engineer – Simulation and Virtualization
Requirements
• Proficient in C/C++, with strong software development, optimization, and user- and kernel-mode debugging skills.
• Understanding of OS fundamentals and system architecture, including low-level interfaces such as buses, controllers, and interrupts.
• Good understanding of hypervisors and HW emulators such as QEMU, KVM, VDK, and Simics.
• Working experience with at least one major Linux distro such as Ubuntu, RedHat, or SLES.
• Strong interpersonal and communication skills to work with a globally distributed engineering team.
• Bachelor’s degree in computer science or related (or equivalent experience) with 5+…
Responsibilities
NVIDIA data center systems, such as DGX and HGX, have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We are hiring a Sr. Software Engineer who will help build simulators for our DGX Server platforms. Simulations play a significant role in building scalable systems at Speed of Light! You will work with world-class engineering teams across HW and SW.
What you’ll be doing:
• Contribute to architecting and developing the simulation platform for next-gen NVIDIA DGX platforms.
• Build, integrate, and enhance simulator components with new HW features, and write supporting technical documents.
• Bring the full SW stack up on the DGX Simulator; work closely with hardware modeling, kernel, and platform driver teams distributed globally.
• Improve performance, fix bugs across the user and kernel stack, and automate the execution flow.
The Tesla Sensing Team is responsible for architecting, designing, validating and integrating state-of-the-art sensing technologies for all Tesla vehicles. The sensors we develop support ADAS (Autopilot), Safety, HVAC, chassis and other subsystems. We are looking for a Software and Algorithms Engineer to join our dynamic team in a fast-paced environment. As a Sensors Algorithms Engineer, you will be responsible for designing, prototyping, implementing and validating the algorithms for various sensors. You will work closely with platform software engineers, hardware design engineers, and mechanical engineers to address issues discovered from development through deployment, understand and debug hardware and software interactions from prototyping through product launch, manage priorities, and support a host of projects simultaneously.
Responsibilities
• Design and develop the algorithms given the sensor product requirements and the feature definition.
• Break down the feature definition into viable use cases to aid the implementation and testing of the algorithm.
• Model and simulate system designs to demonstrate performance metrics.
• Develop unique solutions and efficient implementations of algorithms for challenging applications.
• Work with the hardware design and mechanical engineers to prototype the implementation.
• Define and lead the execution of experiments and trials to collect data in the lab or field.
• Use common industry signal processing tools such as IDL, MATLAB, Python (PyTorch) or C++ to analyze data and generate algorithms.
• Work with the platform software engineer to optimize or implement the design on the embedded platform in the production software.
• Validation: generate and execute the test plan, including milestones that focus on de-risking product and/or feature deployment and that align with the program timeline.
• Collaborate with a broad range of cross-functional teams including mechanical, software, program management, and senior leadership.
• Be willing to learn and tackle a broad range of new problems.
• Define KPIs and design a framework to assess the performance of the algorithms in production.
THE ROLE
We're the small, expert team creating the next-generation server-side infrastructure to support the manufacturing and functionality of fleets of Tesla products, and we're looking for seasoned SREs with domain expertise in one or more of: containers, public clouds and cloud-native apps. Today, Tesla owners rely on our services to safely and securely summon their cars with a tap on their mobile phones -- a feature enabled by one of the many over-the-air updates we've delivered to the Tesla vehicle fleet. Tesla engineering relies on our data and analytics platform to make Tesla products better and safer. And, when an owner needs assistance, Tesla service and support rely on our applications to understand and respond to the situation. Tomorrow, we will apply fleet learning to dispatch and deliver real-time road conditions to millions of autonomous vehicles and manage distributed energy generation & storage at grid scale. Join us and you will work alongside world-class software and data engineers on some of the newest and most challenging IoT, manufacturing and service engineering problems in the world today. The platform you help us build and automate will be used daily by millions of Tesla owners (and tens of thousands of Tesla employees) to improve and enhance the functionality of our cars, chargers, and batteries worldwide.
RESPONSIBILITIES
Design and write software that enables rapid prototyping by development teams, while ensuring the highest levels of reliability and availability. Work directly with our factory firmware team to provide highly available factory-facing services. Drive the migration of large-scale, distributed fleet applications towards cloud-native microservices. Influence architectural decisions with a focus on security, scalability and high performance. Automate the build and deployment of infrastructure using Docker, Kubernetes & other orchestration technologies in a hybrid-cloud environment. Set up and maintain monitoring, metrics & reporting systems for fine-grained observability and actionable alerting.
• Provide Ethernet and routing expertise to customers during project delivery to design, architect and test Ethernet networking solutions.
• Work on multi-functional teams to provide Ethernet network expertise for server infrastructure builds, accelerated computing workloads and GPU-enabled AI applications.
• Craft and evaluate DevOps automation scripts for network operations, craft network architectures, and develop switch fabric configurations.
• Implement tasks related to network configuration and validation for data centers.
• Create Methods of Procedure and deployment documents.
• Use software tools to validate and monitor network performance.