Didi Senior Data Development Engineer (J250715016)
Experienced hire | Full-time | 3-5 years | Technology | Location: Beijing | Status: Hiring
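Requirements
1. Deep understanding of common data modeling theory; able to independently own the design of every layer of the data warehouse.
2. Familiar with the Hadoop ecosystem; expert in HDFS and Hive; familiar with Spark and Presto; experience tuning Spark jobs.
3. Understands data governance, has done governance-related work, and appreciates its importance.
4. Proficient in at least one OLAP engine such as ES, Druid, StarRocks, or ClickHouse.
5. Strong programming ability and experience; rigorous logic; meticulous and patient.
6. Some data analysis ability; data sensitivity and curiosity; focused on discovering and realizing the value of data.
7. Fast learner with communication and coordination skills and team spirit; strong sense of responsibility and eagerness to learn.
8. Some knowledge of newer technologies such as data lakes, lakehouse architecture, and unified stream-batch processing is a plus.
9. A bachelor's degree or above is required, in computer science, mathematics, finance and economics, or a related major.
10. 3-5 years of experience in the internet industry; finance-related knowledge is a plus.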
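Responsibilities
1. Participate in building the foundational finance-domain data for Didi's business lines.
2. Help support data construction for financial reports and management reports across business lines, as well as the closing process.
3. Help support group audit, tax, and related work.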
Includes English-language materials
Data warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
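Hadoop provides reliable, scalable application-layer computing and storage for very large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand.
[English] Hadoop Tutorial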
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
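As a rough illustration of the "simple programming models" mentioned above, the sketch below imitates the MapReduce pattern (map, shuffle, reduce) with a word count in plain Python. It is an assumption that MapReduce is the model meant here; nothing in the code touches Hadoop itself, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit one (word, 1) pair per word in a single input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Illustrative input only; a real job would read its input from HDFS.
    print(word_count(["big data", "Big clusters"]))
```

On a real cluster the map and reduce calls would run on different machines and the shuffle would move data over the network; here everything runs in one process so the shape of the model is easy to see.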
HDFS+
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system for managing large data sets that can run on commodity hardware.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Presto+
[English] What is Presto?
https://prestodb.io/what-is-presto/
https://www.tutorialspoint.com/apache_presto/index.htm
Data governance+
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
ElasticSearch+
https://www.youtube.com/watch?v=a4HBKEda_F8
Learn about Elasticsearch with this comprehensive course designed for beginners, featuring both theoretical concepts and hands-on applications using Python (though applicable to any programming language). The course is structured in two parts: first covering essential Elasticsearch fundamentals including index management, document storage, text analysis, pipeline creation, search functionality, and advanced features like semantic search and embeddings; followed by a practical section where you'll build a real-world website using Elasticsearch as a search engine, working with the Astronomy Picture of the Day (APOD) dataset to implement features such as data cleaning pipelines, tokenization, pagination, and aggregations.
StarRocks+
https://docs.starrocks.io/docs/quick_start/
These Quick Start guides will help you get going with a small StarRocks environment.
https://itnext.io/introduction-to-starrocks-a-new-modern-analytical-database-1db2177d26e1
Recently, I had the opportunity to explore StarRocks, the new kid on the block among massive-scale databases that can handle petabytes of data.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
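To make "column-oriented" concrete, here is a minimal sketch in plain Python. It is not ClickHouse code and says nothing about how ClickHouse is implemented; it only shows why an aggregate over one field is cheap when each column is stored contiguously. The helper names (`to_columns`, `column_sum`) and the sample rows are made up for illustration.

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict mapping column name to values."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def column_sum(columns, name):
    """Aggregate one column without reading any of the others."""
    return sum(columns[name])


# Hypothetical sample rows, loosely modeled on a taxi-trip table.
ROWS = [
    {"trip_id": 1, "passengers": 2, "fare": 10},
    {"trip_id": 2, "passengers": 1, "fare": 15},
    {"trip_id": 3, "passengers": 4, "fare": 7},
]

if __name__ == "__main__":
    columns = to_columns(ROWS)
    print(column_sum(columns, "fare"))
```

In the row layout every record must be visited to total the fares; in the column layout only the `fare` list is read, which is the access pattern analytical queries tend to have.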
OLAP+
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
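The "multidimensional analysis" described above can be sketched as a roll-up: the same fact rows are aggregated along whichever dimensions the analyst picks. The code below is a toy in plain Python, not an OLAP engine; `roll_up`, the dimension names, and the sample facts are all hypothetical.

```python
def roll_up(facts, dimensions, measure):
    """Sum `measure` over `facts`, grouped by the given dimension names.

    Keys of the result are tuples of dimension values, in the order
    the dimensions were requested.
    """
    totals = {}
    for row in facts:
        key = tuple(row[dim] for dim in dimensions)
        totals[key] = totals.get(key, 0) + row[measure]
    return totals


# Hypothetical fact rows: one amount per (city, business line).
FACTS = [
    {"city": "Beijing", "line": "ride", "amount": 10},
    {"city": "Beijing", "line": "freight", "amount": 20},
    {"city": "Shanghai", "line": "ride", "amount": 5},
]

if __name__ == "__main__":
    print(roll_up(FACTS, ["city"], "amount"))          # by city
    print(roll_up(FACTS, ["city", "line"], "amount"))  # by city and line
    print(roll_up(FACTS, [], "amount"))                # grand total
```

Dropping a dimension from the list rolls the data up to a coarser level, and adding one drills down. A real engine precomputes or indexes these aggregates so the answer comes back quickly on large data.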
Data analysis+
[English] Data Analyst Roadmap
https://roadmap.sh/data-analyst
Step-by-step guide to becoming a Data Analyst in 2025
Education+
Related positions
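Experienced hire | Technology
Responsibilities: 1. Participate in data construction for Didi's ride-hailing business and own data development for one business sub-area; 2. Develop a deep understanding of the business characteristics of that area and, drawing on data warehouse modeling theory, abstract and design concrete models; 3. Optimize data warehouse ETL pipelines and resolve related technical issues, bringing your own thinking and practice on stability, scalability, and cost; 4. Through a deep understanding of the business, use data construction to empower the business and create business value.
Updated 2025-09-03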
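Experienced hire | Technology
[Responsibilities] 1. Participate in designing and building the data warehouse that supports Didi's freight business; 2. Abstract the core freight business processes, distill reusable business analysis frameworks, and develop data application products; 3. Continuously improve the warehouse's data governance system, and keep improving the quality, efficiency, and cost of warehouse development.
Updated 2025-08-05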