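Baidu Wenxiaoyan (文小言) - LLM Application Algorithm Engineer - 2026 AIDU - AI Product Innovation Department (J86386)
Campus recruitment | Full-time | AIDU program | Location: Beijing | Status: Hiring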
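Requirements
- Master's degree or above in computer science, mathematics, or a related field, with relevant experience in internet-industry algorithm R&D;
- Solid machine learning fundamentals and proficiency in model architectures such as Transformer, BERT, and GPT; hands-on experience with LLM fine-tuning or RLHF is preferred;
- Proficient in Python/C++, with experience in large-scale distributed training and familiarity with the PyTorch framework;
- Full data closed-loop capability: end-to-end development experience spanning feature engineering, offline training, and online serving;
- Deep understanding of AIGC products and familiarity with production use cases for frontier techniques such as RAG and Agents; experience with products serving tens of millions of users is preferred;
- Strong business insight, with the ability to find algorithm optimization opportunities through data mining; candidates who have led a complete project iteration are preferred;
- Good communication and collaboration skills, the ability to analyze and solve problems independently, and a drive to explore and apply frontier AI technologies to move the technology forward.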
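Responsibilities
- Own R&D and tuning of application-layer algorithms for large models, including algorithm optimization for core modules such as dialogue systems, content generation, and intent understanding; use LLMs to understand user needs in depth and improve the model's reasoning ability and the user experience in complex scenarios;
- Build user-content dynamic matching algorithms and develop personalized recommendation systems that draw on LLM capabilities; develop text/speech/vision multimodal fusion algorithms and explore best practices for new human-computer interaction paradigms on mobile, driving rapid growth in product scale.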
Includes English-language materials
Education
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Machine learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Transformer
https://huggingface.co/learn/llm-course/en/chapter1/4
Breaking down how Large Language Models work, visualizing how data flows through.
https://poloclub.github.io/transformer-explainer/
An interactive visualization tool showing you how transformer models work in large language models (LLM) like GPT.
https://www.youtube.com/watch?v=wjZofJX0v4M
Breaking down how Large Language Models work, visualizing how data flows through.
BERT
https://www.youtube.com/watch?v=xI0HHN5XKDo
Understand the BERT Transformer in and out.
GPT
https://www.youtube.com/watch?v=kCc8FmEb1nY
We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3.
Large language models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for absolute beginners, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
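PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.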
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
Feature engineering
https://www.ibm.com/think/topics/feature-engineering
Feature engineering preprocesses raw data into a machine-readable format. It optimizes ML model performance by transforming and selecting relevant features.
https://www.kaggle.com/learn/feature-engineering
Better features make better models. Discover how to get the most out of your data.
RAG
https://www.youtube.com/watch?v=sVcwVQRHIc8
Learn how to implement RAG (Retrieval Augmented Generation) from scratch, straight from a LangChain software engineer.
AI agent
https://www.ibm.com/think/ai-agents
Your one-stop resource for gaining in-depth knowledge and hands-on applications of AI agents.
Data mining
https://www.youtube.com/watch?v=-bSkREem8dM
Database vs Data Warehouse vs Data Lake
https://www.youtube.com/watch?v=7rs0i-9nOjo
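Related positions
Campus recruitment | AIDU program
- Own R&D and tuning of application-layer algorithms for large models, including algorithm optimization for core modules such as dialogue systems, content generation, and intent understanding; use LLMs to understand user needs in depth and improve the model's reasoning ability and the user experience in complex scenarios;
- Build user-content dynamic matching algorithms and develop personalized recommendation systems that draw on LLM capabilities; develop text/speech/vision multimodal fusion algorithms and explore best practices for new human-computer interaction paradigms on mobile, driving rapid growth in product scale.
Updated 2025-05-19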
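Experienced hire | MEG
- Own R&D of LLM strategy algorithms for Baidu Wenxiaoyan, improving product quality and user experience
- Language models and applications: train and optimize large language models using domain fine-tuning and reinforcement learning, improve the model's reasoning and its ability to satisfy user needs, and explore its potential across application scenarios
- User needs and behavior: analyze massive user data to understand user behavior and intent in depth, providing solid data support for system decisions
- Multimodal technology: explore end-to-end multimodal large-model capabilities, enabling new human-computer interaction paradigms on mobile through fused understanding of text, speech, and vision
Updated 2025-05-19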
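Internship
1. Learn about and take part in data operations for dialogue systems and frontier LLM directions, such as intent classification, entity recognition, document summarization, role-play, AI search, and text-to-image, with a focus on building model datasets and managing labels. Understand the business and algorithm workflows, design complex annotation tasks, and on that basis process raw data, including crawling, filtering, and cleaning, and define annotation standards.
2. Train overseas annotation teams, perform data quality checks, and control annotation quality to ensure data accuracy and consistency.
3. Evaluate models and analyze their results as projects require, complete complex data analysis tasks, report problems promptly, and adjust data and annotation strategies based on model performance.
Updated 2025-07-02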