Tencent: Large Model Inference Framework R&D Engineer
Experienced hire | Full-time | 3+ years | Tencent Cloud | Technology | Location: Hangzhou | Status: Hiring
Requirements
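1. Proficient in C/C++ and Python, with strong coding and debugging skills;
2. Familiar with mainstream large model inference frameworks such as vLLM, SGLang, and TensorRT-LLM, with experience deploying and optimizing language and multimodal models at scale;
3. Familiar with parallelism strategies such as data parallelism and pipeline parallelism; familiarity with NVLink and GPU RDMA communication is preferred;
4. Familiar with the low-level implementation details of various deep learning networks and operators; hands-on experience is preferred;
5. Familiar with mainstream open-source models and their architectural characteristics, with the ability, for different models, to …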
Responsibilities
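1. Develop and optimize the large model inference engine and the PD-disaggregated (prefill/decode separated) inference scheduling system, improving the overall efficiency of large-scale distributed inference systems;
2. Support mainstream GPUs and heterogeneous AI chips, optimize large model inference performance, and build a leading performance and cost advantage.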
Includes English-language materials
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Large models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
vLLM
https://www.newline.co/@zaoyang/ultimate-guide-to-vllm--aad8b65d
vLLM is a framework designed to make large language models faster, more efficient, and better suited for production environments.
https://www.youtube.com/watch?v=Ju2FrqIrdx0
vLLM is a cutting-edge serving engine designed for large language models (LLMs), offering unparalleled performance and efficiency for AI-driven applications.