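Xiaohongshu Data Mining Expert
Experienced hire | Full-time | 5+ years | Content Understanding | Location: Beijing / Shanghai | Status: Hiring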
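Requirements
1. Research experience in machine learning or data mining and project experience in the content domain; proficiency with machine learning models such as classification, clustering, and regression.
2. Interest in data-driven business and skill at breaking business problems down into algorithmic problems; experience contributing to business value or user profiling is preferred.
3. Solid programming skills, with mastery of at least one of Python or Java; experience with big data processing or distributed algorithm development is preferred.
4. Master's degree or above, with up to five years of experience in data mining, machine learning, or large-scale data analysis.
5. Familiarity with Hadoop, Hive, and Spark, and a sound understanding of data warehousing and feature engineering.
6. Experience with ad delivery, DMP tag building, or similar work is preferred.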
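Responsibilities
1. Work closely with business scenarios, using platform-wide data assets and massive multidimensional data to mine DMP tags and feature systems and improve the recommendation performance of ad models.
2. Based on platform-wide data assets and commercialization scenarios, identify high-potential SPU products to enable precisely targeted delivery.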
Includes English-language materials
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Data Mining
https://www.youtube.com/watch?v=-bSkREem8dM
Database vs Data Warehouse vs Data Lake
https://www.youtube.com/watch?v=7rs0i-9nOjo
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with full examples, based on the latest version of Python 3.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Education
Data Analysis
[English] Data Analyst Roadmap
https://roadmap.sh/data-analyst
Step by step guide to becoming a Data Analyst in 2025
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
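Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial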
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
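MapReduce is one of the "simple programming models" associated with Hadoop. The sketch below imitates its map, shuffle, and reduce steps in a single Python process to show the idea; it does not use any Hadoop API, and the function names and sample lines are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence (a real cluster runs them in parallel)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input lines, standing in for file blocks spread across a cluster.
counts = word_count(["big data big clusters", "data pipelines"])
print(counts)
```

On a real cluster each map and reduce call would run on a different machine, with the framework handling the shuffle; here everything runs locally so the data flow is easy to follow.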
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Feature Engineering
https://www.ibm.com/think/topics/feature-engineering
Feature engineering preprocesses raw data into a machine-readable format. It optimizes ML model performance by transforming and selecting relevant features.
https://www.kaggle.com/learn/feature-engineering
Better features make better models. Discover how to get the most out of your data.
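As a rough illustration of what "transforming" raw data into model-ready features can mean, the sketch below rescales a numeric column and one-hot encodes a categorical one using only the Python standard library. The records, field names, and helper functions are hypothetical, and feature selection is not covered.

```python
from typing import Dict, List


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale numeric values to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values: List[str]) -> List[Dict[str, int]]:
    """Turn a categorical column into 0/1 indicator features."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


# Hypothetical raw records (not taken from any real dataset).
records = [
    {"age": 20, "city": "Beijing"},
    {"age": 30, "city": "Shanghai"},
    {"age": 40, "city": "Beijing"},
]

scaled_age = min_max_scale([r["age"] for r in records])
city_flags = one_hot([r["city"] for r in records])

# Combine the transformed columns into one feature dict per record.
features = [
    {"age_scaled": a, **flags} for a, flags in zip(scaled_age, city_flags)
]
print(features)
```

In practice a library such as scikit-learn or Spark MLlib would usually handle these steps; the hand-written version is only meant to make the transformations visible.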
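Related Positions
Experienced hire | 3-5 years | Content Understanding
1. Integrate massive multidimensional data for data mining, and build Xiaohongshu's domestic and overseas POI data asset system toward the goals of being complete, fresh, clean, accurate, and rich, including construction of a structured POI library.
2. Using platform-wide data assets and massive multidimensional data, apply machine learning and statistical analysis to mine POI parent-child relationships, a POI tag system, and a user spatiotemporal knowledge system for the Xiaohongshu open platform business, providing model and service support for all POI scenarios.
Updated on 2025-07-21
Experienced hire | 3-5 years | Content Understanding
1. Integrate massive multidimensional data, carry out site-wide data mining, build the user profiling system and the spatiotemporal knowledge system, and help build the site-wide core data asset management platform.
2. Work closely with business scenarios, using platform-wide data assets and massive multidimensional data and applying machine learning and statistical analysis to explore new growth opportunities for the platform and provide model and feature support for business systems.
Experienced hire | 3+ years | Data Science
1. Research cutting-edge techniques in data mining or statistical learning, integrate site-wide massive multidimensional data, and carry out site-wide data mining.
2. Work closely with business scenarios, using platform-wide data assets and massive multidimensional data and applying machine learning and statistical analysis to explore new growth opportunities for the platform.
3. Find and collect relevant data as the company requires; clean, screen, classify, and integrate raw data; and automate the process.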