DiDi Senior Data Development Engineer (J250803001)
Experienced hire | Full-time | 2+ years | Data | Location: Beijing | Status: Hiring
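Requirements
1. Deep understanding of common data modeling theory; able to independently own the design of every layer of the data warehouse.
2. Familiar with the Hadoop ecosystem; proficient in HDFS, Hive, and MapReduce development; familiar with Spark and Presto; experienced in job tuning.
3. Understanding of data governance, with hands-on governance experience and an appreciation of why it matters.
4. Solid big data and distributed systems experience, including development and operations experience with streaming computation technologies such as Flink, Kafka, and Spark; familiarity with Flink preferred.
5. Proficient in at least one OLAP engine such as ES, Druid, StarRocks, or ClickHouse.
6. Strong programming skills and experience; familiar with at least one of Java, Python, or Scala.
7. Data analysis ability, data sensitivity and curiosity, and a focus on discovering and realizing the value of data.
8. Fast learner with communication and coordination skills, team spirit, a strong sense of responsibility, and eagerness to learn.
9. Some knowledge of newer technologies such as data lakes, lakehouse architecture, and unified stream and batch processing preferred.
10. Bachelor's degree or above in computer science or a related field; 2+ years of internet industry experience; spoken English preferred.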
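Responsibilities
1. Participate in developing offline and real-time data marts and real-time data for DiDi's international platform.
2. Participate in the planning, design, and delivery of offline and real-time data for DiDi's international food delivery business.
3. Participate in performance optimization and operations of real-time risk-control data computation and services, providing stable services to the business.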
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
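Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to several thousand machines.
[English] Hadoop Tutorial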
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
HDFS
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system that manages large data sets and can run on commodity hardware.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
MapReduce
https://www.youtube.com/watch?v=bcjSe0xCHbE
https://www.youtube.com/watch?v=cHGaQz0E7AU
In this video I explain the basics of Map Reduce model, an important concept for any software engineer to be aware of.
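As a rough illustration of the MapReduce model mentioned above, here is a minimal single-process sketch of a word count in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for this example, and a real cluster would run the map and reduce steps in parallel across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over the input records."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical sample input, for illustration only.
docs = ["big data big clusters", "data pipelines"]
counts = word_count(docs)
print(counts)
```

The three phases are kept as separate functions so that the data flow (records to key-value pairs, pairs to groups, groups to results) is visible; in a real framework only the map and reduce functions are written by the user.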
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Presto
[English] What is Presto?
https://prestodb.io/what-is-presto/
https://www.tutorialspoint.com/apache_presto/index.htm
Data Governance
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Kafka
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Elasticsearch
https://www.youtube.com/watch?v=a4HBKEda_F8
Learn about Elasticsearch with this comprehensive course designed for beginners, featuring both theoretical concepts and hands-on applications using Python (though applicable to any programming language). The course is structured in two parts: first covering essential Elasticsearch fundamentals including index management, document storage, text analysis, pipeline creation, search functionality, and advanced features like semantic search and embeddings; followed by a practical section where you'll build a real-world website using Elasticsearch as a search engine, working with the Astronomy Picture of the Day (APOD) dataset to implement features such as data cleaning pipelines, tokenization, pagination, and aggregations.
StarRocks
https://docs.starrocks.io/docs/quick_start/
These Quick Start guides will help you get going with a small StarRocks environment.
https://itnext.io/introduction-to-starrocks-a-new-modern-analytical-database-1db2177d26e1
Recently, I had the opportunity to explore StarRocks, which is the new kid on the block among massive-scale databases that are able to handle petabytes of data.
ClickHouse
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
OLAP
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
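To make "multidimensional analysis" a little more concrete, the sketch below shows a roll-up, one of the basic OLAP operations, over a tiny in-memory fact table in plain Python. The dimension names, the sample rows, and the `roll_up` helper are all hypothetical; a real OLAP engine would express the same idea as a SQL `GROUP BY` over far larger data.

```python
from collections import defaultdict

# Hypothetical fact rows: (country, product, month, order_count).
DIMENSIONS = ("country", "product", "month")
facts = [
    ("BR", "rides", "2025-01", 120),
    ("BR", "food", "2025-01", 80),
    ("MX", "rides", "2025-01", 200),
    ("MX", "rides", "2025-02", 150),
]


def roll_up(rows, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]  # the last column is the measure
    return dict(totals)


by_country = roll_up(facts, ["country"])
by_country_product = roll_up(facts, ["country", "product"])
grand_total = roll_up(facts, [])
print(by_country)
print(by_country_product)
print(grand_total)
```

Keeping fewer dimensions gives a coarser view of the same measure; keeping none collapses the table to a single grand total.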
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Scala
Data Analysis
[English] Data Analyst Roadmap
https://roadmap.sh/data-analyst
Step-by-step guide to becoming a Data Analyst in 2025
Spoken English
https://www.youtube.com/@SpeakEnglishWithVanessa
Speak English naturally, confidently, and fluently with Vanessa.
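Related Positions
Experienced hire | Technology
1. Design and build the DiDi user experience data warehouse, which supports user experience and customer service across the entire DiDi group; 2. Abstract core business processes, build up general-purpose business analysis frameworks, and develop data application products; 3. Continuously improve the warehouse's data governance system to keep raising the quality and efficiency of warehouse development.
Updated 2025-06-04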
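Experienced hire | Technology
1. Design and build the DiDi user experience data warehouse, which supports user experience and customer service across the entire DiDi group; 2. Abstract core business processes, build up general-purpose business analysis frameworks, and develop data application products; 3. Continuously improve the warehouse's data governance system to keep raising the quality and efficiency of warehouse development.
Updated 2025-08-06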
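Experienced hire | 3-5 years | Technology
1. Handle day-to-day development and maintenance of the core services behind food delivery governance, ensuring efficient, high-quality delivery of governance work. 2. Fully understand the governance business; break down the technical architecture from the business model; identify problems in existing systems and propose optimization plans. 3. Work on governance requirements for business lines such as food delivery and instant delivery (闪送), and lead the delivery of medium-scale business requirements.
Updated 2025-08-08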
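Experienced hire | 2+ years | Technology
After years of exploration, DiDi's international business has achieved steady growth. It is one of the few bright spots among Chinese internet companies expanding overseas, and a key company strategy. "Without safety, everything goes back to zero": DiDi is continuously improving the safety experience of its products, and the international safety engineering team provides solid technical support for that goal. Here, you will take part in developing safety features for the international business, providing highly available safety services for ride-hailing, food delivery, and other businesses in multiple countries and regions (Latin America, South America, Australia, New Zealand, Japan); you will work with us on implementing the differing safety requirements of each country, and provide engineering and operations tools that support rapid launches in new countries and cities; take part in developing and iterating on high-concurrency, high-data-volume systems, working with the team to keep improving system performance and reliability; identify existing problems in code or systems and proactively optimize them; actively try new technologies and approaches to broaden your technical horizons.
Updated 2025-08-04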