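Qwen | Qwen Business Unit - LLM Inference Framework Systems R&D Expert - Hangzhou/Beijing/Guangzhou
Experienced hire | Full-time | 3+ years of experience
Location: Beijing | Hangzhou | Guangzhou
Status: Hiring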
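Requirements
1. Proficient in C++/Python; familiar with CUDA programming and GPU inference system development; solid foundation in systems software, distributed systems, parallel computing, and high-performance server-side development.
2. Deep understanding of the core mechanisms of LLM inference frameworks, including Prefill/Decode, KV Cache management, Continuous Batching, PagedAttention, Speculative Decoding, and Quantization.
3. Familiar with distributed inference parallelism and communication optimization, including Tensor Parallel, Pipeline …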
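Responsibilities
1. Participate in core R&D of the LLM inference framework and distributed serving system, building a production-grade inference engine with high performance, high stability, and high scalability for language models, multimodal models, and agentic applications.
2. Develop and optimize core system capabilities such as Prefill/Decode disaggregation, KV Cache management, long-context inference, multi-tenant scheduling, continuous batching, and SLO-aware scheduling, improving TTFT, TPOT, throughput, and resource utilization.
3. Contribute to distributed inference parallelism and communication optimization, including Tensor Parallel, Pipeline Parallel, Expert Parallel, Context Parallel, MoE All-to-All, cross-node communication overlap, load balancing, and fault recovery.
4. Work with the kernel, model compression, model algorithm, and business teams to bring quantization, speculative decoding, MoE optimization, multimodal inference, and long-context optimization into production within the inference framework, and resolve online performance, stability, and cost issues.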
Includes English-language materials
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for absolute beginners, with complete examples, based on the latest version of Python 3.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
CUDA
https://developer.nvidia.com/blog/even-easier-introduction-cuda/
This post is a super simple introduction to CUDA, the popular parallel computing platform and programming model from NVIDIA.
https://www.youtube.com/watch?v=86FAWCzIe_4
Learn how to program with Nvidia CUDA and leverage GPUs for high-performance computing and deep learning.
Distributed systems
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
Large language models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g