新浪微博大模型算法工程师
社招全职1-3年新浪&微博地点:北京状态:招聘
任职要求
1.计算机、数学或相关专业本科及以上学历; 2.1~3年相关工作经验,具备扎实的工程开发能力; 3.熟悉至少一门后端语言(Java/Go),有实际项目开发经验; 4.熟悉大数据处理框架,如 Hive、Flink,具备海量数据处理经验; 5.了解大模型相关技术,包括但不限于大模型训练、微调、推理优化等; 6.有高并发、分布式系统开发经验者优先; 7.具备良好的沟通能力和团队合作精神,能够快速适应业务变化。 加分项(非必须): 1.有AIGC产品落地经验者优先; 2.有开源项目贡献经历者优先; 3.熟悉LangChain、LlamaIndex、GraphRAG等RAG框架者优先; 4.熟悉PyTorch/TensorFlow等深度学习框架者优先。
工作职责
1.参与微博大模型相关系统的研发与工程化落地,推动AIGC技术在内容理解、内容生成等核心业务场景的应用; 2.负责基于大规模数据的模型训练、推理优化及服务部署,提升系统性能与稳定性; 3.主导或参与高并发、低延迟场景下的后端系统设计与开发,保障平台高效运行; 4.结合业务需求,推动Prompt工程、RAG系统等技术在实际产品中的应用; 5.参与AIGC功能的实现与上线,提升用户互动体验; 6.与产品、运营团队紧密配合,完成业务策略的建模与落地,持续优化效果。
包括英文材料
学历+
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Go+
https://www.youtube.com/watch?v=8uiZC0l4Ajw
学习Golang的完整教程!从开始到结束不到一个小时,包括如何在Go中构建API的完整演示。没有多余的内容,只有你需要知道的知识。
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
大模型+
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
高并发+
https://www.baeldung.com/concurrency-principles-patterns
In this tutorial, we’ll discuss some of the design principles and patterns that have been established over time to build highly concurrent applications.
https://www.baeldung.com/java-concurrency
Handling concurrency in an application can be a tricky process with many potential pitfalls. A solid grasp of the fundamentals will go a long way to help minimize these issues.
https://www.oreilly.com/library/view/concurrency-in-go/9781491941294/
You’ll understand how Go chooses to model concurrency, what issues arise from this model, and how you can compose primitives within this model to solve problems.
https://www.oreilly.com/library/view/modern-concurrency-in/9781098165406/
With this book, you'll explore the transformative world of Java 21's key feature: virtual threads.
https://www.youtube.com/watch?v=qyM8Pi1KiiM
https://www.youtube.com/watch?v=wEsPL50Uiyo
分布式系统+
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
LangChain+
https://python.langchain.com/docs/tutorials/
New to LangChain or LLM app development in general? Read this material to quickly get up and running building your first applications.
https://www.freecodecamp.org/news/beginners-guide-to-langchain/
LangChain is a popular framework for creating LLM-powered apps.
LlamaIndex+
https://developers.llamaindex.ai/python/framework/getting_started/starter_example/
This tutorial will show you how to get started building agents with LlamaIndex.
https://www.ibm.com/think/tutorials/llamaindex-rag
LlamaIndex is a powerful open source framework that simplifies the process of building RAG pipelines.
RAG+
https://www.youtube.com/watch?v=sVcwVQRHIc8
Learn how to implement RAG (Retrieval Augmented Generation) from scratch, straight from a LangChain software engineer.
PyTorch+
https://datawhalechina.github.io/thorough-pytorch/
PyTorch是利用深度学习进行数据科学研究的重要工具,在灵活性、可读性和性能上都具备相当的优势,近年来已成为学术界实现深度学习算法最常用的框架。
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
TensorFlow+
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
深度学习+
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
相关职位
社招1年以上算法开发岗
1、参与生成式大模型能力构建;不局限于模型设计、prompt优化、预训练、模型推理加速、其他能力建设等; 2、采用最先进的并行处理和分布式学习技术,制定并执行性能优化策略,显著提升大型语言模型的训练速度和推理能力,例如跟进DeepSeek R1技术架构等,确保技术行业领先; 3、推进大模型技术在京东物流各个业务场景落地,包括不限于智能问答、智能数据分析、智能决策以及Computer Use等,助力业务流程优化,增质提效; 4、深度探索大语言模型方向,保持技术领先优势,推动京东物流在行业内树立高效、精准的大模型/多模态大模型应用标杆,并取得业务收益。
更新于 2025-06-09
社招大模型
1、探索新一代大语言模型基座架构,完成扩散模型(diffusion model)在大语言模型的重塑,突破逐个token预测的方式,实现高效的推理模式,探索全新scaling law; 2、实现大模型训练的数据清洗、合成和评估;设计和实现大模型训练的AI Infra框架。
更新于 2025-09-05