ByteDance Data Mining Algorithm Engineer
Experienced hire · Full-time · 2+ years · G341 · Location: Beijing · Status: Hiring
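Requirements
1. A master's degree or higher in machine learning, data mining, or a related field, or at least 2 years of work experience; candidates with experience in user profiling and segmentation or recommender systems are preferred.
2. Proficiency in at least one of C/C++, Python, or Java; a strong foundation in algorithms and data structures; familiarity with large-scale data mining, machine learning, and related techniques; familiarity with Hadoop/Spark/Hive is a plus.
3. Strong logical thinking, excellent analytical and problem-solving skills, and enthusiasm for challenging problems.
4. A good team player with strong communication skills.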
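Responsibilities
1. Research cutting-edge techniques in data mining and statistical learning; build and optimize user profiles and user attributes from massive volumes of user behavior and content data.
2. Drawing on user understanding and large-scale data features, contribute to model building and domain research in risk control, precision marketing, personalized pricing, and similar areas to improve product performance.
3. Find and collect relevant data as the company requires; clean, screen, classify, and integrate the raw data; and automate the process.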
Includes English-language materials
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Data Mining
https://www.youtube.com/watch?v=-bSkREem8dM
Database vs Data Warehouse vs Data Lake
https://www.youtube.com/watch?v=7rs0i-9nOjo
Education
Recommender Systems
[English] Recommender Systems
https://www.d2l.ai/chapter_recommender-systems/index.html
Recommender systems are widely employed in industry and are ubiquitous in our daily lives.
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for absolute beginners, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Data Structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial in Java
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
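Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and scales from a single machine to thousands of machines.
[English] Hadoop Tutorial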
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Related positions

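Experienced hire · 2+ years
1. Build the intelligent operations system for international flight tickets, using data science methods to solve core supply chain problems such as pricing strategy and revenue management.
2. Optimize strategies for the core business of the international flight ticket supply chain, using data to improve business processes and raise overall efficiency and core metrics.
3. Perform quantitative analysis of international flight ticket supply chain data, uncover the business patterns and value behind the data, identify optimization opportunities, and explore solutions.
Updated 2023-03-27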

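Experienced hire · 2+ years · Algorithm engineering
1. Develop data mining algorithms based on VLMs and multimodal large models to accurately identify long-tail autonomous driving scenarios (such as extreme weather, complex behavior by traffic participants, and rare obstacles).
2. Build efficient automated data mining pipelines to improve data label quality and reduce annotation costs.
3. Combine multimodal data such as point clouds, images, and text to design multimodal features that support cross-modal data retrieval.
Updated 2025-03-20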
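Campus hire · J1002
1. Optimize algorithms across the production pipeline for massive volumes of short videos; build algorithmic models on large-scale signals such as video effects, user profiles, behavior sequences, and consumption feedback to make video production features such as effects and beauty filters more intelligent.
2. Mine production and consumption data with algorithms such as anomaly detection, causal inference, and automated attribution to identify business pain points and guide optimization.
3. Discover trending events and forecast popularity trends to help video effects and related businesses operate and produce more effectively.
4. Mine user features to improve ad-to-user matching efficiency and to support scenarios such as business anti-fraud, channel anti-cheating, and search indexing.
Updated 2025-07-30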