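Baidu ERNIE Bot (文心一言) - Large Model Algorithm Engineer - 2026 AIDU - Natural Language Processing Department (J86391)
Campus recruitment | Full-time | AIDU program | Location: Beijing, Shenzhen | Status: Hiring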
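Requirements
- Theoretical background and hands-on experience in machine learning, deep learning, natural language processing, and/or computer vision;
- Proficiency in Python and in at least one deep learning framework such as PyTorch, TensorFlow, or PaddlePaddle;
- Good teamwork and communication skills, and the ability to analyze and solve problems;
- Preference given to candidates with a solid grasp of the algorithmic principles and implementation details of pre-trained models, hands-on experience with large-scale model pre-training, or relevant publications at top conferences;
- Proficiency in big data processing frameworks such as Hadoop and Spark.
Bonus points:
- Strong competition results: any award in ACM, NOI, NOIP, or other commercial coding competitions;
- Strong academic competition experience, or a high ranking on the leaderboard of a well-known dataset;
- Strong coding ability, with experience on high-quality medium-to-large projects or personal open-source projects;
- An inquisitive mindset, with deep exploration and understanding of particular languages, systems, or algorithms.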
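Responsibilities
- Take part in the R&D of large-scale pre-trained models (text, image, video);
- Explore efficient model tuning strategies and methods for building high-quality data, and research forward-looking large-model technologies and trends;
- Design, implement, and optimize distributed systems and parallel computing frameworks to improve training and inference efficiency;
- Support the platformization of large models and the rollout of innovative applications.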
Includes English-language materials
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Deep Learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
NLP
https://www.youtube.com/watch?v=fNxaJsNG3-s&list=PLQY2H8rRoyvzDbLUZkbudP-MFQZwNmU4S
Welcome to Zero to Hero for Natural Language Processing using TensorFlow!
https://www.youtube.com/watch?v=R-AG4-qZs1A&list=PLeo1K3hjS3uuvuAXhYjV2lMEShq2UYSwX
Natural Language Processing tutorial for beginners series in Python.
https://www.youtube.com/watch?v=rmVRLeJRkl4&list=PLoROMvodv4rMFqRtEuo6SGjY4XbRIVRd4
The foundations of the effective modern methods for deep learning applied to NLP.
OpenCV
https://learnopencv.com/getting-started-with-opencv/
At LearnOpenCV we are on a mission to educate the global workforce in computer vision and AI.
https://opencv.org/university/free-opencv-course/
This free OpenCV course will teach you how to manipulate images and videos, and detect objects and faces, among other exciting topics in just about 3 hours.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
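PyTorch is an important tool for data science research with deep learning. It has considerable advantages in flexibility, readability, and performance, and in recent years has become the framework most commonly used in academia to implement deep learning algorithms.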
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
PaddlePaddle
https://learnopencv.com/paddlepaddle/
PaddlePaddle (PArallel Distributed Deep LEarning) is an open-source deep learning framework released by Baidu in 2016.
https://www.paddlepaddle.org.cn/tutorials
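This course uses PaddlePaddle's signature "horizontal-vertical" (横纵式) teaching method: it progresses from easy to hard, raising the difficulty layer by layer, and explains concepts with figures and case studies so that readers new to deep learning can grasp them quickly.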
Development Frameworks
[English] Understanding Modern Development Frameworks: A Guide for Developers and Technical Decision-makers
https://www.freecodecamp.org/news/understanding-modern-development-frameworks-guide-for-devs/
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
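Hadoop provides reliable, scalable application-layer computing and storage for very large computer clusters. It allows distributed processing of large datasets across clusters of computers using simple programming models, and it scales from a single machine to several thousand.
[English] Hadoop Tutorial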
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
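Related positions
Campus recruitment | AIDU program
- Take part in the R&D of large-scale pre-trained models (text, image, video);
- Explore efficient model tuning strategies and methods for building high-quality data, and research forward-looking large-model technologies and trends;
- Design, implement, and optimize distributed systems and parallel computing frameworks to improve training and inference efficiency;
- Support the platformization of large models and the rollout of innovative applications.
Updated 2025-05-19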

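Experienced hire | 3+ years of experience
1. Lead the company's research on large-model applications, explore feasible ways for large models to use enterprise-specific knowledge to solve domain problems, and guide the team in deploying large models across scenarios.
2. For specific problem scenarios, build domain-ready large models and establish model-friendly data standards, continually driving efficient use of enterprise data.
3. Work with the engineering team to bring large models into production.
4. Deliver business results from productionized large models, including intelligent dialogue, travel planning, product recommendation, and multimodal content generation, creating business value.
Updated 2024-05-30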
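Internship
1. Track cutting-edge developments in large-model algorithms; survey them and write up the key points of technical analysis;
2. Based on concrete business needs, advance the training, fine-tuning, and parameter optimization of large models to improve model fit;
3. Precisely analyze problems that arise in model application or business rollout, propose practical solutions, and take part in validating them;
4. Follow the progress and actual results of business rollout end to end, monitor algorithm performance, and keep iterating on optimization strategies.
Updated 2025-09-03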