DiDi Senior Data Development Engineer (J251021024)
Experienced hire · Full-time · 3-5 years · Technology · Location: Beijing · Status: Hiring
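Requirements
1. In-depth understanding of common data modeling theory; able to independently own the design of every layer of the data warehouse;
2. Familiar with the Hadoop ecosystem; proficient in HDFS, Hive, and MapReduce development; familiar with Spark and Presto; experienced in job tuning;
3. Understanding of data governance; has done governance-related work and appreciates its importance;
4. Solid big data and distributed systems experience, such as stream computing and operations experience with Flink, Kafka, and Spark; familiarity with Flink preferred;
5. Proficient in ES/Druid/StarRocks/ClickHouse, etc. …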
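Responsibilities
1. Contribute to developing offline and real-time data marts and real-time metrics for DiDi's international finance cash-loan business;
2. Contribute to the planning, design, and implementation of offline and real-time data for DiDi's international finance cash-loan business;
3. Contribute to performance optimization and operations of real-time risk-control data computation and services, providing stable service to the business;
4. Contribute to computing real-time tag data, working closely with the algorithm and tagging teams;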
Includes English-language materials
Data Warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
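Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to several thousand machines.
[English] Hadoop Tutorial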
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
HDFS+
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system that manages large data sets and can run on commodity hardware.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
MapReduce+
https://www.youtube.com/watch?v=bcjSe0xCHbE
https://www.youtube.com/watch?v=cHGaQz0E7AU
In this video I explain the basics of Map Reduce model, an important concept for any software engineer to be aware of.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Presto+
[English] What is Presto?
https://prestodb.io/what-is-presto/
https://www.tutorialspoint.com/apache_presto/index.htm
Data Governance+
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
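Related positions
Experienced hire · 5-7 years · Technology
Changping District, Beijing. 1. Responsible for end-to-end construction of the data domain for DiDi's international mobility business; 2. Responsible for optimizing data warehouse ETL processes and resolving related technical issues; 3. Responsible for data modeling and cube data development for DiDi's core business;
Updated 2025-11-05 · Beijing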
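Experienced hire · 5-7 years · Technology
1. Responsible for end-to-end construction of the business security data domain and for building the data layering framework; 2. Responsible for developing offline and real-time security features, providing fast and stable data services for security risk-control strategies; 3. Responsible for planning, designing, and implementing the online and offline security data systems, providing efficient data support for security risk-control strategies
Updated 2025-06-20 · Beijing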
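Experienced hire · 5-7 years · Technology
1. Responsible for end-to-end construction of the data domain for DiDi's international mobility business; 2. Responsible for optimizing data warehouse ETL processes and resolving related technical issues; 3. Responsible for data modeling and cube data development for DiDi's core business;
Updated 2025-07-22 · Beijing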