
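Horizon Robotics: Visual Deep Learning Algorithm Intern (VLA Static Element Understanding)
Campus recruitment, full-time, algorithm track. Location: Beijing | Shanghai. Status: hiring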
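Requirements
1. Current master's or PhD student in computer vision, pattern recognition, machine learning, electronic information, robotics, or a related field.
2. Familiar with mainstream deep learning algorithms and proficient in one or more areas, including but not limited to object detection, segmentation, tracking, multi-task learning, and stereo vision. Candidates with publications at top conferences (CVPR/ICCV/ECCV/ICML/NeurIPS) or in top journals (TPAMI/IJCV/TIP) in computer vision or pattern recognition are preferred; top academic…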
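Responsibilities
1. Explore the scene-understanding capability of VLA models at complex intersections and the improvement they bring to the downstream decision-making module.
2. Own the original design and production deployment of core algorithms or models, including model optimization, building a systematic evaluation framework, and case-driven iteration.
3. Master the closed-loop pipeline of data mining, annotation, training, deployment, and bad-case regression, and keep optimizing it as the business evolves.
4. Develop the ability to iterate models continuously through the data closed loop.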
Includes English-language materials
OpenCV
https://learnopencv.com/getting-started-with-opencv/
At LearnOpenCV we are on a mission to educate the global workforce in computer vision and AI.
https://opencv.org/university/free-opencv-course/
This free OpenCV course will teach you how to manipulate images and videos, and detect objects and faces, among other exciting topics in just about 3 hours.
Pattern recognition
https://www.mathworks.com/discovery/pattern-recognition.html
Pattern recognition is the process of classifying input data into objects, classes, or categories using computer algorithms based on key features or regularities.
https://www.microsoft.com/en-us/research/wp-content/uploads/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science.
Machine learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Deep learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
CVPR
https://cvpr.thecvf.com/
ICCV
https://iccv.thecvf.com/
ICCV is the premier international computer vision event comprising the main conference and several co-located workshops and tutorials.
ECCV
https://eccv.ecva.net/
ECCV is the official event under the European Computer Vision Association and is biannual on even numbered years.
ICML
https://icml.cc/
Related positions

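Internship, algorithm track
1. Explore the scene-understanding capability of VLA models at complex intersections and the improvement they bring to the downstream decision-making module.
2. Own the original design and production deployment of core algorithms or models, including model optimization, building a systematic evaluation framework, and case-driven iteration.
3. Master the closed-loop pipeline of data mining, annotation, training, deployment, and bad-case regression, and keep optimizing it as the business evolves.
4. Develop the ability to iterate models continuously through the data closed loop.
Updated 2025-09-28. Beijing | Shanghai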
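Internship, Core Local Commerce-基
Take a deep part in tackling technical challenges across the full "perception-decision-action" pipeline of embodied intelligence, conducting in-depth research in one or more of the following directions:
1. Perception and decision planning: improve the performance of multimodal large models in embodied scenarios, including spatial understanding of complex dynamic environments, decomposition of compound tasks into steps, and judgment of task state.
2. Action and control: train robots to acquire high-precision manipulation skills from real-robot demonstration data, augmented data, and large-scale internet video data. Research and apply reinforcement learning algorithms on robots to optimize their action policies, improving motion robustness and skill generalization in the physical world.
3. Data augmentation: scale up real-robot teleoperation data through simulation, world models, and similar methods, and explore how to address the scarcity of real-robot teleoperation data.
Updated 2025-12-25. Beijing | Shanghai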
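Internship
1. VLA/VLN algorithm development: research and implement Vision-Language-Action (VLA) / Vision-Language Navigation (VLN) algorithms that let a robot move autonomously based on natural-language instructions and the current scene.
2. Multimodal fusion: develop modules that fuse vision, language, map, and other multimodal information to improve the accuracy of navigation decisions.
3. Scene understanding: implement semantic scene understanding based on vision and language, supporting target localization and path planning in complex environments.
4. Model training and optimization: take responsibility for training and tuning VLA/VLN models and for optimizing their inference performance.
5. Data and evaluation: contribute to navigation dataset construction, evaluation metric design, and benchmark development.
Updated 2025-12-01. Shenzhen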