DiDi Senior Data Development Engineer (J250616020)
Experienced hire | Full-time | 5+ years | Technology | Location: Beijing | Status: Hiring
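Requirements
1. Bachelor's degree or above in computer science, mathematics, or a related field.
2. Five or more years of data development experience, a deep understanding of common data modeling theory, and the ability to independently own the design of every data warehouse layer.
3. Familiarity with the Hadoop ecosystem; proficiency in HDFS, Hive, and MapReduce development; familiarity with Spark, Presto, and HBase; experience with job tuning, real-time development, and data governance.
4. Strong programming ability and experience, with familiarity in at least one of Java, Python, or Scala.
5. Ability to analyze requirements for complex businesses, strong structured thinking and problem analysis skills, good communication skills, and a collaborative team spirit.
6. Technical planning ability, strong self-motivation and sense of responsibility, and the ability to tackle complex problems and deliver results.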
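Responsibilities
1. Build and develop the data warehouse for DiDi's core businesses, carrying out end-to-end warehouse modeling and continuous optimization across data production, data processing, data applications, and governance.
2. Abstract core business processes, consolidate general-purpose business analysis frameworks, and develop the warehouse middle layer and data application products.
3. Own the standardization and optimization of data development processes and code, continuously improve the data governance system, and keep raising the quality and efficiency of data warehouse construction.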
Includes English-language materials
Education
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
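Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop provides reliable, scalable application-layer computing and storage for very large computer clusters. It allows large data sets to be processed in a distributed way across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial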
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
HDFS
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system that manages large data sets and can run on commodity hardware.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
MapReduce
https://www.youtube.com/watch?v=bcjSe0xCHbE
https://www.youtube.com/watch?v=cHGaQz0E7AU
In this video I explain the basics of the MapReduce model, an important concept for any software engineer to be aware of.
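As a rough illustration of the MapReduce model named above, the sketch below runs a word count in a single Python process. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example, and a real cluster would distribute each phase across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit one (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (hypothetical helper)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big", "data warehouse"]))
```

The same three-phase shape (emit key/value pairs, group by key, aggregate per key) is what a Hive `GROUP BY` query compiles down to when it runs on the MapReduce engine.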
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Presto
[English] What is Presto?
https://prestodb.io/what-is-presto/
https://www.tutorialspoint.com/apache_presto/index.htm
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with the HBase shell.
Data governance
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Scala
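Related positions
Experienced hire | 1+ years | Technology
Responsible for R&D of DiDi's international search engine, including:
1. Take part in research on DiDi's highly innovative search system technology and tackle world-class problems in intelligent search. Unlock the value of large-scale geographic data, advance the application of NLP in smart maps, lead in geographic information technology, and create the best possible travel experience.
2. Use deep learning to redefine the map query semantic analysis and recall architecture, optimize the user query analysis and rewriting engine, improve recall quality and efficiency, and solve complex query understanding and recall problems.
3. Take part in innovative technology research, using large models and large-scale geographic data to transform traditional search technology and advance AI.
Updated 2025-06-16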
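Experienced hire | Technology
1. Take part in building data for DiDi's ride-hailing business, owning data development for one business sub-area.
2. Gain a deep understanding of the business characteristics of that area and, applying data warehouse modeling theory, carry out concrete model abstraction and design.
3. Optimize data warehouse ETL processes and resolve related technical issues, bringing your own thinking and practice on stability, scalability, and cost.
4. Through a deep understanding of the business, use data construction to empower the business and create business value.
Updated 2025-06-09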
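Experienced hire | 2+ years | Technology
1. Take part in and own the architecture design and business evolution of business modules for travel scenarios such as DiDi Hotels.
2. Fully understand the business and platform design, produce forward-looking system abstractions, and drive system evolution.
3. Have the ability to tackle hard technical problems and continuously optimize system stability and performance.
Updated 2025-06-12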