miHoYo LLM Post-training Algorithm Researcher
Campus recruitment | Full-time | Programming & Technology | Location: Shanghai, Beijing | Status: Hiring
Requirements
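1. Master's or PhD in computer science, artificial intelligence, or a related field
2. Familiar with the Transformer architecture; proficient in PyTorch and mainstream large-model training frameworks (e.g. DeepSpeed, Megatron-LM, vLLM)
3. Hands-on experience with SFT and RLHF, and an understanding of stability and efficiency issues during training
4. Strong engineering implementation skills and the ability to reproduce papers quickly
Bonus points
1. …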
Responsibilities
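1. Frontier algorithm exploration: follow the latest advances in the field and explore effective and efficient post-training methods such as RLHF and RLAIF to improve the model's reasoning ability on complex logical tasks
2. Alignment strategy research: explore improvements to post-training algorithms such as PPO, DPO, and GRPO to optimize model performance in instruction following, multi-turn dialogue consistency, and related areas
3. High-quality data engineering: own data governance for the SFT and RLHF stages; explore synthetic data, data evolution, and data-mixing strategies to address data scarcity
4. Long context and memory: contribute to optimizing Long Context training techniques, improving the model's attention retention and information retrieval over long sequences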
Includes English-language materials
Transformer
https://huggingface.co/learn/llm-course/en/chapter1/4
Breaking down how Large Language Models work, visualizing how data flows through.
https://poloclub.github.io/transformer-explainer/
An interactive visualization tool showing you how transformer models work in large language models (LLM) like GPT.
https://www.youtube.com/watch?v=wjZofJX0v4M
Breaking down how Large Language Models work, visualizing how data flows through.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
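PyTorch is an important tool for data science research using deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.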
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
Large models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
DeepSpeed
https://www.youtube.com/watch?v=pDGI668pNg0
Megatron
https://www.youtube.com/watch?v=hc0u4avAkuM
vLLM
https://www.newline.co/@zaoyang/ultimate-guide-to-vllm--aad8b65d
vLLM is a framework designed to make large language models faster, more efficient, and better suited for production environments.
https://www.youtube.com/watch?v=Ju2FrqIrdx0
vLLM is a cutting-edge serving engine designed for large language models (LLMs), offering unparalleled performance and efficiency for AI-driven applications.
SFT
https://cameronrwolfe.substack.com/p/understanding-and-using-supervised
Understanding how SFT works from the idea to a working implementation...