
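Momenta Senior Model Inference Optimization Engineer
Experienced hire | Full-time | 3+ years | Location: Beijing | Shenzhen | Shanghai | Status: Hiring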
Qualifications
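Job requirements
1. At least 3 years of work experience; master's degree or higher in computer science, mathematics, physics, electronic engineering, or automatic control.
2. Strong coding skills; proficient in C/C++ or Python, with CUDA development experience; familiarity with inference frameworks such as TVM, TensorRT, Triton, or CUTLASS is preferred.
3. Familiar with computer architecture, with a solid understanding of GPUs, NPUs, and similar hardware.
4. Familiar with the principles of mainstream models such as CNN, Transformers, and DETR.
5. Good teamwork skills; highly innovative, with strong hands-on implementation ability and enthusiasm for technology.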
Responsibilities
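Job responsibilities
1. Drive extreme performance optimization of intelligent-driving models across a range of hardware platforms, combining compiler optimization, parallel-computing optimization, graph fusion, and efficient CUDA operator development to achieve industry-leading on-vehicle inference performance.
2. For specific NPU hardware platforms, make efficient use of the hardware based on a deep understanding of its architecture.
3. For PyTorch/CUDA GPU computing tasks, optimize operators and systems to improve training and inference efficiency.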
Includes English-language materials
Education
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in Python.
CUDA
https://developer.nvidia.com/blog/even-easier-introduction-cuda/
This post is a super simple introduction to CUDA, the popular parallel computing platform and programming model from NVIDIA.
https://www.youtube.com/watch?v=86FAWCzIe_4
Learn how to program with Nvidia CUDA and leverage GPUs for high-performance computing and deep learning.
TensorRT
https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/quick-start-guide.html
This TensorRT Quick Start Guide is a starting point for developers who want to try out the TensorRT SDK; specifically, it demonstrates how to quickly construct an application to run inference on a TensorRT engine.
Triton Inference Server
https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html
Triton Inference Server is an open source inference serving software that streamlines AI inferencing.
CNN
https://learnopencv.com/understanding-convolutional-neural-networks-cnn/
Convolutional Neural Network (CNN) forms the basis of computer vision and image processing.
[English] CNN Explainer
https://poloclub.github.io/cnn-explainer/
Learn Convolutional Neural Network (CNN) in your browser!
https://www.deeplearningbook.org/contents/convnets.html
Convolutional networks (LeCun, 1989), also known as convolutional neural networks, or CNNs, are a specialized kind of neural network for processing data.
https://www.youtube.com/watch?v=2xqkSUhmmXU
MIT Introduction to Deep Learning 6.S191: Lecture 3 Convolutional Neural Networks for Computer Vision
Related positions

Experienced hire | Algorithms
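Job description
- Research and develop cutting-edge deep-learning-based perception and prediction technology for autonomous driving;
- Based on application scenarios and customer requirements, provide solutions (both algorithmic and engineering) for model compression, training optimization, and inference optimization;
- Track, analyze, and evaluate the mainstream deep learning frameworks.
Job requirements
- Master's degree or higher in computer science, mathematics, physics, electronic engineering, or automatic control, with an AI-related research focus;
- Expert in C++, familiar with typical computer architectures, with experience in distributed performance optimization and outstanding programming ability;
- Familiar with at least one deep learning framework, with 2+ years of deep learning framework development experience; understanding of distributed training and model parallelism;
- Familiar with CUDA/TensorRT or other AI acceleration libraries, with related development experience;
- Proficient in the Linux environment, with shell scripting experience; familiar with Python.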
Updated 2024-04-18
Experienced hire | 5+ years | D8039
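1. Responsible for AI platform architecture design and for implementing AI engineering technology.
2. Improve the efficiency of the company's AI model training and inference through AI infrastructure and hardware-software co-optimization.
3. Responsible for developing, optimizing, and deploying to production the inference services for large and small models, on the cloud side or the device side.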
Updated 2025-04-01
Experienced hire | MEG
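- Participate in the architecture design, development, and optimization of the company's deep learning inference engine and AI compiler, keeping the inference engine technology at the forefront of the field
- Participate in extreme inference performance optimization for multimodal LLMs and video generation models, maintaining industry SOTA
- Study recent inference optimization techniques, track the latest research and technology trends, propose improvements and innovative ideas, drive the team's technical development, and apply the results to the business
- Work with the team to tackle technical challenges across scenarios such as high performance, high concurrency, and high availability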
Updated 2025-04-25