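Baidu Vision: Multimodal Generation Algorithm Engineer, 2026 AIDU (J85298)
Campus recruitment | Full-time | AIDU program | Location: Beijing | Shanghai | Shenzhen | Status: Hiring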
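Requirements
- Ideals and a sense of mission for advancing artificial intelligence;
- Master's degree or above in computer science, electronic engineering, mathematics, or a related field; open to graduates of the class of 2026;
- Proficiency in Python and familiarity with at least one framework such as PyTorch, Paddle, TensorFlow, or MXNet;
- Familiarity with common computer vision generation algorithms, a strong interest in areas such as image generation, image editing, style transfer, video generation, 3D generation, and digital humans, and some project experience;
- Strong learning ability and teamwork, with a proactive approach to solving problems;
- Preference given to candidates who have published as first author at computer vision or machine learning conferences or journals such as CVPR, ICCV, ECCV, ICML, NeurIPS, and COLT, or who have extensive project experience.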
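Responsibilities
- Take part in research and development of visual generation algorithms, including image generation, image editing, style transfer, video generation, 3D generation, and digital humans;
- Read papers in the relevant fields; reproduce and improve existing algorithms;
- Take part in training, optimizing, and deploying algorithm models;
- Work closely with team members to move projects forward.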
Includes English-language materials
Education
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with full examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
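PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.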
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
PaddlePaddle
https://learnopencv.com/paddlepaddle/
PaddlePaddle (PArallel Distributed Deep LEarning) is an open-source deep learning framework released by Baidu in 2016.
https://www.paddlepaddle.org.cn/tutorials
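This course uses PaddlePaddle's distinctive "horizontal-vertical" teaching method, moving from easy to hard with difficulty increasing level by level, and explains concepts with diagrams and case studies so that readers new to deep learning can understand them quickly.
TensorFlow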
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
OpenCV
https://learnopencv.com/getting-started-with-opencv/
At LearnOpenCV we are on a mission to educate the global workforce in computer vision and AI.
https://opencv.org/university/free-opencv-course/
This free OpenCV course will teach you how to manipulate images and videos, and detect objects and faces, among other exciting topics in just about 3 hours.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
CVPR
https://cvpr.thecvf.com/
ICCV
https://iccv.thecvf.com/
ICCV is the premier international computer vision event comprising the main conference and several co-located workshops and tutorials.
ECCV
https://eccv.ecva.net/
ECCV is the official event under the European Computer Vision Association and is held biennially in even-numbered years.
ICML
https://icml.cc/
NeurIPS
https://neurips.cc/
Machine learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Related positions
Campus recruitment | AIDU program
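Multimodal sensor fusion perception end-to-end model R&D:
- Design and develop fusion perception models and algorithms based on multimodal sensors such as cameras, LiDAR, and 3D/4D millimeter-wave radar (including but not limited to obstacle detection, OCC (Occupancy Network), scene semantic segmentation, and tracking), improving perception in complex and extreme scenarios;
- Build an automated data collection and annotation system that covers corner cases, develop a data quality evaluation framework, and establish a closed data-model iteration loop;
- Improve model generalization through self-supervised and weakly supervised learning, accelerate the data flywheel, and explore the practice and application of VLM, VLA, and related techniques within the data flywheel.
Light-map / mapless model R&D:
- Design and implement light-map and mapless models based on multimodal sensors, achieving real-time light-map generation at L4, including real-time generation of topology information and various road attributes, to provide basic road perception capability for large-scale L4 deployment;
- Build the data closed loop and data flywheel for light maps, such as designing and implementing mining algorithms, hard-case simulation and generation methods, and simulation systems suited to light maps.
World model R&D:
- Design a world model based on multimodal sensors, providing strong simulation-based verification and perception capabilities for validating solutions to complex problems and for validating end-to-end models;
- Build the data closed loop and data flywheel needed to realize the world model, solving the related algorithm-heavy problems such as data collection, generation, and automated annotation.
Updated 2025-05-19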
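Campus recruitment | AIDU program
- Lead R&D and tuning of application-layer algorithms for large models; optimize the algorithms of core modules such as dialogue systems, content generation, and intent understanding; use LLMs to gain a deep understanding of user needs, improving the model's reasoning ability and the user experience in complex scenarios;
- Build user-content dynamic matching algorithms and develop personalized recommendation systems that incorporate large-model capabilities; develop text/speech/vision multimodal fusion algorithms and explore best practices for new human-computer interaction paradigms on mobile, driving rapid growth in product scale.
Updated 2025-06-23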