NVIDIA Deep Learning Compiler CI/Infrastructure Engineer
Experienced hire | Full-time | Location: Shanghai | Status: Hiring
Requirements
• BS, MS, or PhD (or equivalent experience) in Computer Science, Computer/Electrical Engineering, Mathematics, or a related field
• 5+ years of experience designing, scaling, and operating CI/CD, build/release, or developer infrastructure for complex software systems
• Proven experience building CI platforms end-to-end using systems such as GitLab CI, Jenkins, or similar tools, including pipeline orchestration, compute/runner management, artifact and package systems, and observability, with strong emphasis on reliability, reproducibility, and debuggability
• Strong software engineering skills (Python required), with the ability to design, implement, and debug distributed systems end-to-end
• Familiarity with edge devices (SoC, e.g. NVIDIA Tegra) in a host-target architecture, with the ability to debug them and knowledge of their automation nuances
• Proven track record of designing, building, and deploying AI/LLM-based systems in real engineering workflows, demonstrating skill in evaluating trade-offs, failure modes, maintainability, and measurable impact on developer productivity, signal quality, o…
Responsibilities
• Build, maintain, and improve CI infrastructure that supports development, verification, and release of NVIDIA’s deep learning compiler stacks across GPU and accelerator environments
• Improve CI reliability and signal quality by reducing flakes, improving reproducibility, strengthening diagnostics, and making correctness and performance failures easier to understand and act on
• Apply automation, AI, and agent-based workflows to reduce manual CI operations, speed up failure triage, and improve developer efficiency
• Build reusable and self-service CI platforms that support multiple products, projects, model suites, hardware targets, and software configurations while partnering closely with compiler, infrastructure, and release teams
Includes English-language materials
CI
https://www.ibm.com/cn-zh/think/topics/continuous-integration
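Continuous integration (CI) is a software development practice in which developers regularly integrate new code and code changes into a central code repository throughout the development cycle. It is a key component of DevOps and agile methodologies.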
https://www.youtube.com/watch?v=42UP1fxi2SY
CD
https://www.redhat.com/zh-cn/topics/devops/what-is-ci-cd
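CI/CD stands for continuous integration and continuous delivery/deployment, and aims to streamline and accelerate the software development lifecycle.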
https://www.youtube.com/watch?v=R8_veQiYBjI&list=PLy7NrYWoggjzSIlwxeBbcgfAdYoxCIrM2
GitLab
https://docs.gitlab.com/tutorials/
Learn about GitLab fundamentals by following guided instructions.
Jenkins
https://www.youtube.com/watch?v=f4idgaq2VqA
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
SoC
https://www.arm.com/resources/education/books/modern-soc
The aim of this textbook is to expose aspiring and practising SoC designers to the fundamentals and latest developments in SoC design and technologies using examples of Arm Cortex-A technology and related IP blocks and interfaces.
https://www.arm.com/resources/education/education-kits/introduction-to-soc
To produce students with solid introductory knowledge on the basics of SoC design and key practical skills required to implement a simple SoC on an FPGA and write embedded programs targeted at the microprocessor to control the peripherals.
https://www.youtube.com/watch?v=dokgLSAhqHI
A key part of the digital innovation revolution has been the embrace of the SoC, or system-on-chip.