DiDi Senior Data Development Engineer (J250609029)
Experienced hire · Full-time · 5-7 years · Technology · Location: Beijing · Status: Hiring
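Requirements
1. Bachelor's degree or above in computer science or a related field; candidates with 5-7 years of experience in data development at internet companies are preferred.
2. Familiar with data warehouse architecture, common data modeling methods, and data governance; strong business understanding and abstraction skills; able to build the data architecture and models for a data domain.
3. Familiar with mainstream big data and real-time processing technologies such as Hadoop, HDFS, Spark, StarRocks, and Flink.
4. Familiar with data governance work, with hands-on experience in data stability.
5. Strong programming skills and experience; proficient in at least one of Java or Python; familiar with Linux and proficient with Shell.
6. Background in user profiling, feature engineering, or machine learning preferred; security business background preferred.
7. Fast learner with good communication and coordination skills and team spirit; strong sense of responsibility and eagerness to learn.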
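Responsibilities
1. Build the business security data domain end to end and set up the data layering framework.
2. Develop offline and real-time security features, providing fast and stable data services for security risk-control strategies.
3. Plan, design, and implement the online and offline security data systems, providing efficient data support for security risk-control strategies.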
Includes English-language materials
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Data governance
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
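Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial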
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
HDFS
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system that manages large data sets and can run on commodity hardware.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
StarRocks
https://docs.starrocks.io/docs/quick_start/
These Quick Start guides will help you get going with a small StarRocks environment.
https://itnext.io/introduction-to-starrocks-a-new-modern-analytical-database-1db2177d26e1
Recently, I had the opportunity to explore StarRocks which is the new kid in the block when talking about massive scale databases which are able to handle petabytes of data.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Linux
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Bash
[English] The Bash Guide
https://guide.bash.academy/
A quality-driven guide through the shell's many features.
https://www.youtube.com/watch?v=tK9Oc6AEnR4
Understanding how to use bash scripting will enhance your productivity by automating tasks, streamlining processes, and making your workflow more efficient.
Feature engineering
https://www.ibm.com/think/topics/feature-engineering
Feature engineering preprocesses raw data into a machine-readable format. It optimizes ML model performance by transforming and selecting relevant features.
https://www.kaggle.com/learn/feature-engineering
Better features make better models. Discover how to get the most out of your data.
Machine learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
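Related positions
Experienced hire · Technology
1. Take part in data development for DiDi's ride-hailing business, owning the data development for a complex business area.
2. Develop a deep understanding of the characteristics of that business area and, drawing on data warehouse modeling theory, carry out concrete model abstraction and data architecture design, balancing model stability, extensibility, and cost.
3. Optimize data warehouse ETL pipelines and resolve related technical problems; build up and reflect on technical practice, and provide technical guidance to junior colleagues.
4. Understand the business through business planning, and use data development to empower the business and create business value.
Updated 2025-06-09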
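Experienced hire · Technology
1. Take part in data development for DiDi's ride-hailing business, owning data development for a business sub-area.
2. Develop a deep understanding of the characteristics of that business area and, drawing on data warehouse modeling theory, carry out concrete model abstraction and design.
3. Optimize data warehouse ETL pipelines and resolve related technical problems, bringing your own thinking and practice on stability, extensibility, and cost.
4. Develop a deep understanding of the business, and use data development to empower it and create business value.
Updated 2025-06-09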
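Experienced hire · 5-7 years · Technology
1. Build the data domain end to end for DiDi's international mobility business.
2. Optimize data warehouse ETL pipelines and resolve related technical problems.
3. Handle data modeling for DiDi's core business and cube data development.
Updated 2025-07-22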