Xiaomi Multimodal Algorithm Engineer
Experienced hire | Full-time | 3+ years | A47316 | Location: Beijing | Status: Hiring
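Requirements
1. Bachelor's degree or above in computer science or a related field, with at least three years of experience; a fast learner with strong communication and comprehension skills, a positive attitude, resilience under demanding work, and a strong drive for technical excellence.
2. Proficient in deep learning frameworks such as TensorFlow and PyTorch.
3. Familiar with training and optimization methods for large models, including fine-tuning of pretrained models and model compression.
4. Familiar with common tasks and model architectures in natural language processing (NLP) and computer vision (CV).
5. Solid programming skills, with proficiency in Python and similar languages.
6. Excellent logical thinking and data analysis skills; good at analyzing and solving problems; strong communication and teamwork skills.
7. Candidates who meet one or more of the following are preferred:
   - Published papers at top international conferences or journals (including but not limited to CVPR, ICCV, ECCV, NIPS, ICML, AAAI, TPAMI, IJCV, EMNLP, SIGIR, WSDM, and SIGKDD).
   - Strong coding skills or substantial competition experience.
   - A PhD or master's degree in computer science, image processing, pattern recognition, machine learning, or a related field, from a key laboratory.
   - Substantial relevant experience, for example at least one year of internship in computer vision and machine learning at an AI company, or a background in a well-known lab, in China or abroad, in computer vision, machine learning, computer graphics, data mining, or a related field.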
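Responsibilities
We are looking for an experienced, innovative deep learning / large model researcher or engineer to join our fast-growing team. Our work focuses on algorithm development for Xiaomi's automotive and smartphone businesses, providing strong AI support for the company's products and services. Here you will work with top scientists and engineers to explore and deliver breakthroughs in AI and large model technology, creating greater value for the company.

Job description
1. Design, develop, and optimize large model architectures to improve model performance and efficiency.
2. Research and apply the latest large model techniques, including but not limited to pretrained models, fine-tuning, and model compression.
3. Take part in training and tuning large models so that they perform as well as possible across different tasks and datasets.
4. Work with the team to apply large model technology to the company's products and services, improving user experience and business value.
5. Write technical documentation and research reports, and share research results and practical experience.
6. Track industry developments and the academic frontier, and bring new techniques into the company's large model R&D promptly.
7. Based on business delivery goals, optimize performance, latency, and related metrics, and complete end-to-end engineering integration, testing, and deployment.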
Includes English-language materials
Deep learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
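PyTorch is an important tool for deep-learning-based data science research. It offers considerable advantages in flexibility, readability, and performance, and in recent years has become the framework most commonly used in academia to implement deep learning algorithms.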
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
Large models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
NLP
https://www.youtube.com/watch?v=fNxaJsNG3-s&list=PLQY2H8rRoyvzDbLUZkbudP-MFQZwNmU4S
Welcome to Zero to Hero for Natural Language Processing using TensorFlow!
https://www.youtube.com/watch?v=R-AG4-qZs1A&list=PLeo1K3hjS3uuvuAXhYjV2lMEShq2UYSwX
Natural Language Processing tutorial for beginners series in Python.
https://www.youtube.com/watch?v=rmVRLeJRkl4&list=PLoROMvodv4rMFqRtEuo6SGjY4XbRIVRd4
The foundations of the effective modern methods for deep learning applied to NLP.
OpenCV
https://learnopencv.com/getting-started-with-opencv/
At LearnOpenCV we are on a mission to educate the global workforce in computer vision and AI.
https://opencv.org/university/free-opencv-course/
This free OpenCV course will teach you how to manipulate images and videos, and detect objects and faces, among other exciting topics in just about 3 hours.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Data analysis
Data Analyst Roadmap
https://roadmap.sh/data-analyst
Step-by-step guide to becoming a Data Analyst in 2025
CVPR
https://cvpr.thecvf.com/
ICCV
https://iccv.thecvf.com/
ICCV is the premier international computer vision event comprising the main conference and several co-located workshops and tutorials.
ECCV
https://eccv.ecva.net/
ECCV is the official event of the European Computer Vision Association and is held biennially in even-numbered years.
ICML
https://icml.cc/
WSDM
https://www.wsdm-conference.org/
Image processing
https://opencv.org/blog/computer-vision-and-image-processing/
This fascinating journey involves two key fields: Computer Vision and Image Processing.
https://www.geeksforgeeks.org/python/image-processing-in-python/
Image processing involves analyzing and modifying digital images using computer algorithms.
https://www.youtube.com/watch?v=kSqxn6zGE0c
In this introduction to image processing with Python, Kaggle Grandmaster Rob Mulla shows how to work with image data in Python.
Pattern recognition
https://www.mathworks.com/discovery/pattern-recognition.html
Pattern recognition is the process of classifying input data into objects, classes, or categories using computer algorithms based on key features or regularities.
https://www.microsoft.com/en-us/research/wp-content/uploads/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science.
Machine learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Data mining
https://www.youtube.com/watch?v=-bSkREem8dM
Database vs Data Warehouse vs Data Lake
https://www.youtube.com/watch?v=7rs0i-9nOjo
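Related positions
Experienced hire | 3+ years | Core Local Commerce - 点
1. Use computer vision and AI technology to improve the creation experience across several Dianping content-creation products, such as notes and reviews.
2. Contribute to the development of innovative content features built on computer vision and AI.
3. Explore the frontier of MLLM, LLM, VLM, and related algorithms, and apply them to solve real business problems.
4. Analyze technical problems in business scenarios, design algorithms, and bring them to production; take part in and drive efficient execution at every stage to deliver sustained growth in business value.
5. Research and develop intelligent agent systems, and optimize how users interact with AI systems.
Updated 2025-04-17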
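Experienced hire | 3-5 years | Algorithm development role
1. Build a semantic understanding system for containers and goods based on computer vision plus VLM/MLLM, fusing image, point cloud, and text information to make recognition and localization of complex SKUs more robust.
2. Design networks that fuse detection/segmentation with 3D point clouds, to perform instance segmentation and 6D grasp point prediction in scenes where multiple products are mixed together.
3. Instruction-tune multimodal large models such as LLaVA, Qwen2-VL, and InternVL2.5 to support natural-language task assignment and dynamic planning for robots.
4. Own joint camera and radar calibration and multi-sensor fusion (RGB-D + point cloud + torque sensors).
5. Build automatic labeling and active learning pipelines, and build a data flywheel.
6. Follow the latest multimodal technology in the industry, and quickly validate and deploy it in warehousing scenarios.
Updated 2025-06-10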
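Experienced hire | TEG Technology
1. Develop multimodal large models for vertical scenarios, including pretraining and SFT across image-text, video, audio, and other modalities, and explore the use of synthetic data in multimodal training.
2. Own content understanding for multiple business scenarios such as large model safety, content governance, and e-commerce, including multimodal representation, image-text/video intent understanding, identical/similar content judgment, and automatic question answering.
3. Track and research frontier problems in large models, and apply the results to real business pain points.
Updated 2025-04-16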