Pinduoduo Algorithm Engineer (NLP)
Experienced hire · Full-time · 3+ years · Technical · Location: Shanghai · Status: Hiring
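Requirements
1. Degree in computer science or a related field, bachelor's or above; 3+ years of experience in the internet industry; background in e-commerce or NLP algorithms preferred.
2. Solid theoretical foundation in machine learning/NLP, strong mathematical skills, and good programming ability and modeling mindset.
3. Familiarity with at least one deep learning framework (e.g., TensorFlow/PyTorch/MXNet/Keras).
4. Familiarity with common big data platforms and tools (e.g., Hadoop/Hive/Spark).
5. Well-rounded technical skills; self-driven and results-oriented, with a strong sense of responsibility.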
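Responsibilities
1. Apply and extend NLP techniques in algorithms for identical/similar product matching and product SPU extraction.
2. Analyze and mine the many kinds of text data in e-commerce scenarios, including but not limited to product titles/descriptions/attributes and warehousing-and-distribution UGC, to build a supply chain knowledge system.
3. Build and maintain internal foundational NLP capabilities, including but not limited to word segmentation, entity recognition, knowledge extraction, and semantic understanding.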
Includes English-language materials
Education+
NLP+
https://www.youtube.com/watch?v=fNxaJsNG3-s&list=PLQY2H8rRoyvzDbLUZkbudP-MFQZwNmU4S
Welcome to Zero to Hero for Natural Language Processing using TensorFlow!
https://www.youtube.com/watch?v=R-AG4-qZs1A&list=PLeo1K3hjS3uuvuAXhYjV2lMEShq2UYSwX
Natural Language Processing tutorial for beginners series in Python.
https://www.youtube.com/watch?v=rmVRLeJRkl4&list=PLoROMvodv4rMFqRtEuo6SGjY4XbRIVRd4
The foundations of the effective modern methods for deep learning applied to NLP.
Algorithms+
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Machine Learning+
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Deep Learning+
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
TensorFlow+
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
PyTorch+
https://datawhalechina.github.io/thorough-pytorch/
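PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years has become the framework most commonly used in academia to implement deep learning algorithms.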
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
MXNet+
https://www.tutorialspoint.com/apache_mxnet/index.htm
Apache MXNet is a powerful deep learning framework that supports both symbolic and imperative programming.
Keras+
https://keras.io/getting_started/intro_to_keras_for_engineers/
Keras 3 is a deep learning framework that works with TensorFlow, JAX, and PyTorch interchangeably.
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
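Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to thousands of machines.
[English] Hadoop Tutorial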
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
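Related positions
Experienced hire · QQ · AI technology
1. Optimize and implement algorithms for applying large language models in products; 2. Survey cutting-edge industry algorithms for large models, track the latest technical developments, and apply them in related projects; 3. Take part in project discussions and propose improvements to product applications based on technical judgment and technical optimization.
Updated 2025-05-28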
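Internship
1. Conduct research on large-model algorithms and tackle key technical challenges, with application scenarios including text and multimodal content generation and understanding tasks for mobile phones; 2. Follow the industry's latest techniques and methods in text and multimodal pretraining, reinforcement learning, and reasoning-enhanced models, and develop industry-leading original algorithms based on the needs of the company's key business scenarios; 3. Publish papers at top conferences and give external technical talks to raise the team's overall technical influence.
Updated 2025-07-14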
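Experienced hire · 1-3 years · Large models
1. Research and develop core NLP algorithms for content safety, such as text classification, sentiment analysis, long-text semantic understanding, and public opinion analysis; build and optimize NLP models to improve model performance and the ability to respond quickly to evolving risks; comprehensively identify politically sensitive, pornographic, and policy-violating content; and build industry-leading content recognition capabilities; 2. Track cutting-edge techniques and research in NLP and explore applying new techniques in real business settings, such as large-model fine-tuning and acceleration, proposing innovative NLP solutions for different business forms; 3. Work closely with business teams to understand their needs and drive adoption of content safety solutions across the company's application scenarios.
Updated 2025-08-23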