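Pinduoduo: Multimodal Algorithm Engineer (CV / Multimodal Large Models)
Experienced hire · Full-time · 3-5 years · Technical · Location: Shanghai · Status: Hiring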
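Requirements
1. Solid algorithmic background in multimodal learning and CV; proficient in the principles of image encoding with CNN and ViT architectures; familiar with deep learning frameworks such as PyTorch.
2. 3-5 years or more of work experience in multimodal domains such as product understanding, content understanding, and content structuring; business-minded, able to break business problems down into algorithmic solutions; some technical management experience.
3. Experience serving algorithm models at large data scale with high concurrency and high throughput in complex business settings.
Bonus points
1. Experience fine-tuning and applying image-text pretrained large models such as CLIP/BLIP/InternVL; relevant open-source project experience preferred.
2. Publications at conferences such as CVPR/NeurIPS/ECCV, or in journals, preferred.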
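Responsibilities
1. Research and explore in depth the use cases and practical deployment of multimodal algorithms in cross-border e-commerce; design and implement e-commerce multimodal models based on deep learning and multimodal large models for key tasks and scenarios such as product understanding, attribute recognition, and AIGC.
2. Work closely with the engineering team to bring algorithm models into production, optimize business processes and outcomes, and provide strong algorithmic support for business growth.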
Includes English-language materials
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
CNN
https://learnopencv.com/understanding-convolutional-neural-networks-cnn/
Convolutional Neural Network (CNN) forms the basis of computer vision and image processing.
[English] CNN Explainer
https://poloclub.github.io/cnn-explainer/
Learn Convolutional Neural Network (CNN) in your browser!
https://www.deeplearningbook.org/contents/convnets.html
Convolutional networks (LeCun, 1989), also known as convolutional neural networks, or CNNs, are a specialized kind of neural network for processing data.
https://www.youtube.com/watch?v=2xqkSUhmmXU
MIT Introduction to Deep Learning 6.S191: Lecture 3 Convolutional Neural Networks for Computer Vision
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
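PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.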
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
Deep learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
High concurrency
https://www.baeldung.com/concurrency-principles-patterns
In this tutorial, we’ll discuss some of the design principles and patterns that have been established over time to build highly concurrent applications.
https://www.baeldung.com/java-concurrency
Handling concurrency in an application can be a tricky process with many potential pitfalls. A solid grasp of the fundamentals will go a long way to help minimize these issues.
https://www.oreilly.com/library/view/concurrency-in-go/9781491941294/
You’ll understand how Go chooses to model concurrency, what issues arise from this model, and how you can compose primitives within this model to solve problems.
https://www.oreilly.com/library/view/modern-concurrency-in/9781098165406/
With this book, you'll explore the transformative world of Java 21's key feature: virtual threads.
https://www.youtube.com/watch?v=qyM8Pi1KiiM
https://www.youtube.com/watch?v=wEsPL50Uiyo
Large models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
CVPR
https://cvpr.thecvf.com/
NeurIPS
https://neurips.cc/
ECCV
https://eccv.ecva.net/
ECCV is the official event of the European Computer Vision Association and is held biennially, in even-numbered years.
Related positions
Internship · Taotian Group, research-oriented …
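We are Alimama's search advertising algorithm team, responsible for the design and optimization of the algorithms behind ad monetization in Taobao search, including but not limited to:
1. Researching the application of multimodal large models to understanding Taobao's massive volume of image-text and video creatives;
2. Researching the application of generative large models/AIGC algorithms to mining creatives for ad delivery;
3. Researching the end-to-end application and upgrading of multimodal and generative large models across the search advertising pipeline;
4. Researching the design and optimization of multi-creative delivery algorithms for search advertising, covering products, livestreams, short videos, and more;
5. Researching training and inference acceleration for ultra-large-scale multimodal large models;
6. Researching the design and optimization of classic CV/multimodal tasks, including classification, detection, OCR, and metric learning.
Updated 2025-08-08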
Experienced hire · 2+ years
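1. Integrate multimodal large model technology into the Xiaomi (小蜜) intelligent Q&A system, continuously improving the system's intelligence and user experience;
2. Research and apply the latest multimodal understanding techniques, such as image recognition, natural language processing, and speech recognition, so that the system can handle all types of input, or apply image generation to offline knowledge production and real-time Q&A;
3. Mine the valuable information contained in each modality of a product, such as images, videos, and text descriptions, to understand and distill product Q&A knowledge;
4. Work with the data science team to design and implement model training strategies, and design and tune multimodal model prompts for specific domains;
5. Closely track and survey frontier work in multimodal/NLP/CV, including text-to-image and image-to-text.
Updated 2025-08-18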