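Xiaohongshu: Deep Learning Inference Optimization Engineer - Engine Architecture
Experienced hire | Full-time | Backend development | Location: Beijing | Shanghai | Status: Hiring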
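Requirements
1. Proficient in C/C++, with solid systems skills, good coding habits, and the ability to collaborate well in a team.
2. Familiar with mainstream training and inference frameworks such as TensorFlow / PyTorch, and understand their execution models and underlying principles; familiarity with any training platform or component such as XDL, TFRA, DeepRec, TorchRec, or DeepSpeed is a plus.
3. Familiar with mainstream inference serving frameworks and backends, such as TensorFlow Serving, TensorRT, ONNX Runtime, and XLA, with hands-on experience in model deployment and tuning.
4. Understanding of common recommendation models (such as DeepFM, DIN, and SIM) and of the full workflow of sample generation, training, launch, inference, and feature serving is a plus.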
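Responsibilities
1. Participate in the architecture design and core module development of the model training and inference engine; build an industry-leading training-inference engine on TensorFlow / PyTorch to support the upgrade to next-generation model architectures such as long-sequence modeling and generative recommendation.
2. Work with the storage and data platform teams to build a unified ML data pipeline, providing platform capabilities such as feature management, development and debugging, version control, and efficient production.
3. Own the development and performance optimization of core modules in the training and inference infrastructure, including but not limited to Embedding management components, the feature DSL engine, and the serving scheduling and inference framework.
4. Follow cutting-edge models and system architectures such as LLM / Agent, and explore engineering approaches for bringing them into search and recommendation.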
Includes English-language materials
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Coding standards
[English] Google Style Guides
https://google.github.io/styleguide/
Every major open-source project has its own style guide: a set of conventions (sometimes arbitrary) about how to write code for that project. It is much easier to understand a large codebase when all the code in it is in a consistent style.
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
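PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.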
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
Transformer
https://huggingface.co/learn/llm-course/en/chapter1/4
Breaking down how Large Language Models work, visualizing how data flows through.
https://poloclub.github.io/transformer-explainer/
An interactive visualization tool showing you how transformer models work in large language models (LLM) like GPT.
https://www.youtube.com/watch?v=wjZofJX0v4M
Breaking down how Large Language Models work, visualizing how data flows through.
Large language models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
Ray
https://github.com/ray-project/ray
Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://www.youtube.com/watch?v=FhXfEXUUQp0
In this video, I'll teach you everything you need to know about Apache Ray!
https://www.youtube.com/watch?v=fMiAyj2kgac
Using powerful machine learning algorithms is easy using Ray.io and Python.
https://www.youtube.com/watch?v=q_aTbb7XeL4
Parallel and Distributed computing sounds scary until you try this fantastic Python library.
TensorRT
https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/quick-start-guide.html
This TensorRT Quick Start Guide is a starting point for developers who want to try out the TensorRT SDK; specifically, it demonstrates how to quickly construct an application to run inference on a TensorRT engine.
ONNX
https://github.com/onnx/tutorials
Open Neural Network Exchange (ONNX) is an open standard format for representing machine learning models.
[English] Introduction to ONNX
https://onnx.ai/onnx/intro/
This documentation describes the ONNX concepts (Open Neural Network Exchange).
DeepFM
https://d2l.ai/chapter_recommender-systems/deepfm.html
DeepFM consists of an FM component and a deep component which are integrated in a parallel structure.
https://deeprs-tutorial.github.io/WWW_DNN.pdf
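Related positions
Campus hire | Engine
1. Participate in the architecture design and core module development of the model training and inference engine; build an industry-leading training-inference engine on TensorFlow / PyTorch to support the upgrade to next-generation model architectures such as long-sequence modeling and generative recommendation.
2. Work with the storage and data platform teams to build a unified ML data pipeline, providing platform capabilities such as feature management, development and debugging, version control, and efficient production.
3. Own the development and performance optimization of core modules in the training and inference infrastructure, including but not limited to Embedding management components, the feature DSL engine, and the serving scheduling and inference framework.
4. Follow cutting-edge models and system architectures such as LLM / Agent, and explore engineering approaches for bringing them into search and recommendation.
Updated 2025-08-30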
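Campus hire | Engine
1. Lead the architecture design and core module development of the next-generation training and inference engine, supporting large-scale deployment for the search, advertising, and recommendation businesses in frontier scenarios such as long-sequence modeling, generative recommendation, and Agents.
2. Collaborate closely with the storage and data platform teams to build an end-to-end ML data pipeline: unified feature management, second-level debugging, version tracking, and one-click launch, so that data scientists can focus on model innovation.
3. Continuously optimize the training and inference infrastructure: in-house high-speed Embedding storage, a feature DSL engine, elastic scheduling, and a serving inference framework, achieving 10x-level performance gains.
4. Track the latest LLM / Agent progress and engineer it into the search, advertising, recommendation, and agent businesses, defining new industry standards.
Updated 2025-09-04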
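Experienced hire | Core Local Commerce-业
1. Responsible for the engineering architecture of the machine learning engine behind home-delivery (到家) search and recommendation, including CTR/LLM model training and inference optimization and building the user feature platform.
2. Build a high-performance, extensible machine learning engine for multiple scenarios, supporting the search and recommendation needs of food delivery, flash shopping (闪购), pharmacy, marketing, and other scenarios.
3. Continuously optimize the engineering architecture to improve system performance, compute scale, and iteration efficiency.
4. Research frontier technology trends in the industry and, based on actual business conditions, bring them into production.
Updated 2025-04-17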