Xiaohongshu Foundation LLM Algorithm Expert
Experienced hire · Full-time · 3-5 years · LLM · Location: Shanghai | Beijing | Hangzhou · Status: Hiring
Qualifications
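1. Background: master's or PhD in computer science, electronics, mathematics, or a related field; deep understanding of LLM training, inference, and data construction workflows.
2. Specialization: in areas such as pretraining (data mixture, model architecture, AI Infra), SFT (e.g. data synthesis, rejection sampling), reinforcement learning (e.g. Reward Model, GRPO/PPO), or model inference (e.g. speculative decoding), has…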
Responsibilities
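1. Pretraining: design and implement model architectures; continually refine multi-stage pretraining techniques; iterate on the data mixture (quality, category distribution, difficulty, etc.) by combining automated and manual filtering; train Dense and MoE models across the full range of sizes; and explore next-generation LLM paradigms such as Hybrid architectures and Diffusion training/inference.
2. Post-training: SFT data synthesis, rejection sampling, data mixing, and model training; building a sample-level labeling system; RL data synthesis, Reward Model design, router replay, and RL algorithm innovation, to significantly improve the model's generation quality in the alignment stage.
3. Data & evaluation: continually improve the data pipeline, including collection, cleaning, deduplication, and mixing; synthesize a variety of high-quality agentic/reasoning training data to improve the model's general capabilities; continually improve the LLM evaluation system and benchmarks so that they effectively measure capabilities across STEM, math, code, knowledge, instruction following, multilingual, and other dimensions.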
Includes English-language materials
Large language models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
SFT
https://cameronrwolfe.substack.com/p/understanding-and-using-supervised
Understanding how SFT works from the idea to a working implementation...
Reinforcement learning
https://cloud.google.com/discover/what-is-reinforcement-learning?hl=en
Reinforcement learning (RL) is a type of machine learning where an "agent" learns optimal behavior through interaction with its environment.
https://huggingface.co/learn/deep-rl-course/unit0/introduction
This course will teach you about Deep Reinforcement Learning from beginner to expert. It’s completely free and open-source!
https://www.kaggle.com/learn/intro-to-game-ai-and-reinforcement-learning
Build your own video game bots, using classic and cutting-edge algorithms.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
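PyTorch is an important tool for data-science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.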
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
Megatron
https://www.youtube.com/watch?v=hc0u4avAkuM
SGLang
[English] Install SGLang
https://docs.sglang.ai/get_started/install.html
SGLang is a fast serving framework for large language models and vision language models.
https://github.com/sgl-project/sgl-learning-materials