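NIO Internship - Reinforcement Learning Algorithm Engineer
Internship | Part-time | Algorithms | Location: Shanghai | Status: Hiring
Requirements
1. Master's or PhD in artificial intelligence, robotics, multimodal large models, or a related field.
2. Familiarity with the principles and algorithms of robot perception systems (depth cameras, monocular/stereo visual SLAM, etc.).
3. Familiarity with kinematic and dynamic modeling of multi-degree-of-freedom robots and with control theory (e.g., impedance control, operational space control).
4. Familiarity with classic reinforcement learning algorithms (PPO, SAC, DDPG, TD3, etc.) and frameworks.
5. Development experience with physics simulation tools (Isaac Gym, PyBullet, Gazebo); hands-on Sim2Real experience (e.g., domain randomization, adaptive policies) is preferred.
6. Project experience deploying RL algorithms for robotic grasping, navigation, dexterous manipulation, or similar tasks, or publications at top conferences (ICRA, IROS, CoRL, NeurIPS), is preferred.
7. Strong analytical and problem-solving skills and good teamwork.
Responsibilities
1. Select, design, and integrate multi-degree-of-freedom robot products (bipedal, quadrupedal, dexterous hands) according to business needs.
2. Design end-to-end RL policies that combine visual (RGB-D), force, tactile, and other inputs to achieve closed-loop control of interaction with the environment.
3. Design and implement reinforcement learning solutions for multi-degree-of-freedom robots that enable adaptive decision-making and control in complex physical environments and complex tasks.
4. Build high-fidelity robot simulation environments on physics engines (e.g., Isaac Gym, PyBullet, MuJoCo) to support training and Sim2Real transfer.
5. Deploy and validate models, solve engineering problems that arise on real robots, and optimize training algorithms, improving robustness through dynamic disturbances, randomized parameters, and similar methods.
6. Track industry developments in frontier directions such as world models, hierarchical reinforcement learning (HRL), and multi-agent reinforcement learning (MARL); evaluate them and put them into practice to improve robot performance on complex tasks.
Includes English-language materials
Large Models+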
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
SLAM+
https://docs.mrpt.org/reference/latest/tutorial-slam-for-beginners-the-basics.html
[English] SLAM for Dummies
https://dspace.mit.edu/bitstream/handle/1721.1/119149/16-412j-spring-2005/contents/projects/1aslam_blas_repo.pdf
A Tutorial Approach to Simultaneous Localization and Mapping
https://ouster.com/insights/blog/introduction-to-slam-simultaneous-localization-and-mapping
SLAM is an essential piece in robotics that helps robots to estimate their pose – the position and orientation – on the map while creating the map of the environment to carry out autonomous activities.
[English] What Is SLAM?
https://www.mathworks.com/discovery/slam.html
How it works, types of SLAM algorithms, and getting started
Algorithms+
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Reinforcement Learning+
https://cloud.google.com/discover/what-is-reinforcement-learning?hl=en
Reinforcement learning (RL) is a type of machine learning where an "agent" learns optimal behavior through interaction with its environment.
https://huggingface.co/learn/deep-rl-course/unit0/introduction
This course will teach you about Deep Reinforcement Learning from beginner to expert. It’s completely free and open-source!
https://www.kaggle.com/learn/intro-to-game-ai-and-reinforcement-learning
Build your own video game bots, using classic and cutting-edge algorithms.
Gymnasium+
https://gymnasium.farama.org/index.html
An API standard for reinforcement learning with a diverse collection of reference environments
https://www.youtube.com/watch?v=FvuyrpzvwdI
Learn to use Gymnasium for Python, which allows you to create environments to run reinforcement learning programs against in Python.
Gazebo+
https://gazebosim.org/docs/latest/getstarted/
When you’re ready, follow the next few steps to get up and running with simulation using Gazebo.
https://www.youtube.com/watch?v=laWn7_cj434
In this video we learn how to simulate our robot using Gazebo.
NeurIPS+
https://neurips.cc/
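Related positions
Internship
1. Develop locomotion and whole-body control policies for robots based on reinforcement/imitation learning.
2. Develop vision-based reinforcement learning locomotion policies for complex terrain.
3. Train, port, and deploy the policies, bringing the algorithms from sim to real on physical robots.
4. Keep tracking frontier research results in China and abroad, and reproduce the relevant algorithms.
5. Write the related technical documentation and promote knowledge retention and sharing within the team.
Updated 2025-08-20
Internship | Programming & Technology
Responsible for algorithm R&D and model training for video generation models in the post-training/reinforcement learning stage, applying frontier reinforcement learning algorithms to improve model stability and video generation quality and to align the model closely with the aesthetic preferences of human experts.
Core responsibilities
1. Study frontier reinforcement learning algorithms in depth; explore RL-based optimization approaches for video generation tasks and build the training framework.
2. Based on the weaknesses of the video generation model, analyze the optimization objectives of the RL algorithms and design data collection plans.
3. Design and implement multi-objective reinforcement learning algorithms for video generation; design and train the reward model.
4. Write high-quality technical reports and papers, and work with the team to drive technical innovation and stay at the forefront of the industry.
Internship | UAV Business Department
1. Develop and optimize calibration, localization, mapping, and environment perception algorithms based on multi-sensor fusion (LiDAR/Camera/IMU/GNSS, etc.).
2. Develop and optimize global and local planner algorithms to improve the robot's operating efficiency and stability in complex scenarios.
3. Use Sim2Real techniques to improve algorithm performance.
Updated 2025-07-15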