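Kuaishou Data Mining / Algorithm Engineer (Core Assets) - [Data Platform]
Experienced hire | Full-time | 3+ years | D6225 | Location: Beijing | Status: Hiring
Requirements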
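1. Research and project background in machine learning or data mining; proficient with machine learning models such as classification, regression, and clustering; able to break business problems down into suitable data and algorithm problems and deliver business value.
2. Solid programming fundamentals and mastery of at least one programming language; experience with big data computing and distributed algorithm development.
3. Curiosity, good sensitivity to data and business, and a strong interest in data-driven business.
4. Bachelor's degree or above, with 3+ years of experience in data mining, machine learning, or large-scale data analysis.
5. Strong foundation in mathematical statistics and mining algorithms; fluent use of Python/SQL.
6. Familiar with Hadoop, Hive, and Spark, with a sound understanding of data warehousing and feature engineering.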
Responsibilities
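1. Integrate massive multi-dimensional data, carry out site-wide data mining, build the user profile system and the spatio-temporal knowledge system, and build the site-wide core data asset management platform.
2. Dive deep into business scenarios; using site-wide massive multi-dimensional data and a combination of statistical and data mining/machine learning methods, explore new growth opportunities for the platform and provide feature and model support for business systems.
3. Participate deeply in specialized initiatives such as attribution analysis, anomaly detection, and knowledge graphs.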
Includes English-language materials
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Data Mining
https://www.youtube.com/watch?v=-bSkREem8dM
Database vs Data Warehouse vs Data Lake
https://www.youtube.com/watch?v=7rs0i-9nOjo
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Education
Data Analysis
[English] Data Analyst Roadmap
https://roadmap.sh/data-analyst
Step by step guide to becoming a Data Analyst in 2025
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
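Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large data sets to be processed in a distributed way across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial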
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Feature Engineering
https://www.ibm.com/think/topics/feature-engineering
Feature engineering preprocesses raw data into a machine-readable format. It optimizes ML model performance by transforming and selecting relevant features.
https://www.kaggle.com/learn/feature-engineering
Better features make better models. Discover how to get the most out of your data.
Related positions
Experienced hire | 1+ years | D6255
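1. Integrate massive multi-dimensional data, carry out site-wide data mining, build the user profile system and the spatio-temporal knowledge system, and build the site-wide core data asset management platform; 2. Dive deep into business scenarios; using site-wide massive multi-dimensional data and a combination of statistical and data mining/machine learning methods, explore new growth opportunities for the platform and provide feature and model support for business systems; 3. Participate deeply in specialized initiatives such as attribution analysis, anomaly detection, and knowledge graphs.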
Updated 2024-07-15
Experienced hire | 1+ years | D6225
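1. Integrate Kuaishou's massive heterogeneous data across all domains to build company-level core assets, including but not limited to a unified ID service, spatio-temporal assets, and user profiles; 2. Participate in building the core asset R&D system, for example architecture design, data warehouse construction, and data governance; 3. Tackle hard technical problems at massive data scale, such as the engineering problems in relationship mining and graph mining; 4. Dive deep into business scenarios, understand business pain points, and provide data-driven solutions for each business line.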
Updated 2024-08-28
Experienced hire | 5+ years | Technical - Data
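1. Join the business intelligence (BI) team at Amap (高德地图), develop a deep understanding of Amap's core businesses, and provide data support for decision makers in assessing business value and making business decisions; 2. Data development: take part in building data warehouses and data products for Amap's core businesses such as ride-hailing and search, participate in data governance, and consolidate business data assets; 3. Data mining: using Amap's massive logs, apply algorithmic models to uncover valuable business information that guides Amap's product iteration.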
Updated 2025-07-31