Tesla Sr System Engineer - HPC

Experienced hire | Full-time | IT - Infrastructure and Operations | Location: Shanghai | Status: Hiring

Requirements

Experience with cluster deployment and operations on Linux Operating System flavors (Ubuntu/RHEL).

Advanced experience with configuration management systems such as Ansible.

Demonstrable knowledge of TCP/IP, RoCE, Linux Operating System internals, filesystems, disk/storage technologies, and storage protocols.

Experience designing and deploying medium- to large-scale InfiniBand networks.

Proficiency in a high-level programming or scripting language (Python, Go, Bash).

Experience with containers (Docker, Kubernetes).

Familiarity with Prometheus, Grafana, and Splunk for monitoring and alerting.

Experience administering HPC workload managers (Slurm, BCM, etc.).

Experience with high-throughput, low-latency networks and GPU-based computing systems.

Fluent in reading, writing and…

Job Description

The Role

Compute is the most important driver in accelerating the maturation of AI-enabled products. Today, Tesla is at the forefront of creating meaningful real-world products using AI. We design, build, and run large-scale GPU clusters that enable our teams to build better products faster. We are an extremely small team, and the work of every member carries an immense amount of weight. Working with the team, you will build performance testing tools, health check tools, and tools for better metric collection, along with other fun projects.

Responsibilities

You’ll be working in a cross-functional and highly versatile team that designs, implements, and maintains HPC technical stacks.

Leverage and improve upon existing cluster management solutions to ensure rapid deployment and scalability.

Ensure the reliability of the existing systems to guarantee uptime and availability of core foundational services.

Influence architectural decisions with a focus on security, scalability, and high performance.

Work with engineering teams to identify useful metrics to collect, and implement the corresponding monitoring and alerting with existing monitoring solutions.

Improve root cause analysis and corrective action for problems large and small – identify patterns and design task automations.

Help develop automated tools that collect information users can apply directly to root cause analysis of issues in their job submissions.

Organize and document implemented solutions for long-term information retention in our internal ticketing and documentation system.

Take part in a 24x7 on-call rotation.

Must

Related positions

Senior Software Engineer – Simulation and Virtualization

Experienced hire

NVIDIA data center systems, such as DGX and HGX, have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We are hiring a Sr. Software Engineer who will help build simulators for our DGX Server platforms. Simulations play a significant role in building scalable systems at Speed of Light! You will work with world-class engineering teams across HW and SW.

What you’ll be doing:

• Contribute to architecting and developing the simulation platform for next-gen NVIDIA DGX platforms.

• Build, integrate, and enhance simulator components with new HW features, and write supporting technical documents.

• Bring the full SW stack up on the DGX Simulator; work closely with hardware modeling, kernel, and platform driver teams distributed globally.

• Improve performance, fix bugs across the user and kernel stack, and automate execution flow.

Updated 2025-09-22 | Shanghai | Beijing | Shenzhen

(Sr) System Engineer

Experienced hire | Low-Voltage Electronic Systems

We are seeking a highly motivated and experienced embedded SW Engineer to join our LV Electronic Module Design team in Tesla Shanghai. The candidate will play a crucial role in algorithm design and implementation, embedded software development, and module/vehicle-level testing and validation. The candidate should have a good understanding of and hands-on experience with RTOS, ARM architecture, embedded system hardware, etc. Knowledge of and experience with wireless phone charging, such as Qi and high-power private protocols, are highly appreciated.

Responsibilities:

Architect, code, and debug embedded software in C/C++ for microcontrollers to implement functions required in electronic modules.

Develop device drivers (SPI/I2C/UART/USB/CAN) and RTOS modules, and implement communication protocol stacks.

Collaborate closely with the EE/FW engineers to ensure seamless integration and system-level evaluations.

Troubleshoot using oscilloscopes, logic analyzers, JTAG debuggers, and protocol analyzers.

Support manufacturing testing and CI.

Support the production FW build and release.

Shanghai

Sr. Battery System Engineer, Battery Management System

Experienced hire | Hardware

As a Battery System Engineer, you will engage with an experienced cross-disciplinary staff to conceive and design innovative consumer products. You will work closely with an internal interdisciplinary team and outside partners to drive key aspects of product definition and execution. You must be responsive, flexible, and able to succeed within an open, collaborative peer environment.

In this role, you will:

1. Lead the design, development, and delivery of Li-ion battery systems per performance and safety requirements

2. Drive battery development from NPI through mass production

3. Research and evaluate emerging battery technologies

4. Collaborate with product teams to define battery specifications

5. Design battery protection circuits and packs for NPI programs, including schematic design and component selection

6. Develop and review battery pack schematics, BOMs, and layout to meet design requirements

7. Conduct system and design reviews, failure mode and effects analysis (DFMEA), and risk assessments

8. Analyze and resolve battery-related issues in production and in the field

9. Perform battery safety assessment and design for safety

10. Support battery certification processes (CTIA/IEEE1725)

11. Manage and coordinate with CMs (contract manufacturers) on battery development for NPI programs

12. Build and maintain strong relationships with suppliers and manufacturing partners

Updated 2025-10-02 | Shanghai | Shenzhen

Sr. Battery System Engineer

Experienced hire | Research

As a Battery System Engineer, you will engage with an experienced cross-disciplinary staff to conceive and design innovative consumer products. You will work closely with an internal interdisciplinary team and outside partners to drive key aspects of product definition and execution. You must be responsive, flexible, and able to succeed within an open, collaborative peer environment. The role operates primarily in high-ambiguity problem spaces, requires original scientific judgment, and produces reusable scientific assets that influence multiple programs, suppliers, and long-term roadmaps.

In this role, you will:

1. Lead the design, development, and delivery of Li-ion battery systems per performance and safety requirements

2. Drive battery development from NPI through mass production

3. Research and evaluate emerging battery technologies

4. Collaborate with product teams to define battery specifications

5. Design battery protection circuits and packs for NPI programs, including schematic design and component selection

6. Develop and review battery pack schematics, BOMs, and layout to meet design requirements

7. Conduct system and design reviews, failure mode and effects analysis (DFMEA), and risk assessments

8. Analyze and resolve battery-related issues in production and in the field

9. Perform battery safety assessment and design for safety

10. Support battery certification processes (CTIA/IEEE1725)

11. Manage and coordinate with CMs (contract manufacturers) on battery development for NPI programs

12. Build and maintain strong relationships with suppliers and manufacturing partners

Updated 2026-03-30 | Shenzhen