滴滴数据研发专家(J250304011)
社招全职技术地点:北京状态:招聘
任职要求
1、PB级流量数仓经验,掌握数据仓库体系架构、建模方法、数据治理知识; 2、熟悉数据分析方法,了解流量分发业务玩法,对流量价值探索充满热情; 3、较强的业务理解抽象能力,良好的沟通协作能力,能快速定位解决问题; 4、具有技术规划能力,较强自驱力和责任感,面对复杂问题能攻关拿结果; 5、技术栈:包括不限于Hadoop/Hive/Spark/Flink/OLAP/SQL/Java/Scala等。
工作职责
1、负责设计构建支撑滴滴用户增长和流量分发的滴滴全域用户增长数仓; 2、负责抽象核心业务流程,沉淀业务通用分析框架,开发数据应用产品; 3、负责不断完善数仓数据治理体系,能持续提升数仓建设的质量和效率;
包括英文材料
数据仓库+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
数据治理+
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
数据分析+
[英文] Data Analyst Roadmap
https://roadmap.sh/data-analyst
Step by step guide to becoming an Data Analyst in 2025
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop 为庞大的计算机集群提供可靠的、可伸缩的应用层计算和存储支持,它允许使用简单的编程模型跨计算机群集分布式处理大型数据集,并且支持在单台计算机到几千台计算机之间进行扩展。
[英文] Hadoop Tutorial
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[英文] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
OLAP+
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
什么是SQL?简单地说,SQL就是访问和处理关系数据库的计算机标准语言。
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala+
相关职位
社招技术
1.参与滴滴内部 post-training 框架研发,聚焦 LLM + RL 方向,设计框架架构与技术路线,提升其扩展性、稳定性与效率。 2.优化框架性能,如训练速度、显存占用等,降低训练成本,为 LLM + RL 训练提供有力技术支撑。 3.协同业务团队,将 LLM 能力在业务场景落地,根据业务需求定制训练方案并评估验证模型。 4.关注行业前沿,引入有价值的技术到公司框架和模型中,探索新算法与方法,推动技术创新。
更新于 2025-06-16
社招5年以上D0599
1、负责或参与快手电商的离线/实时数据架构设计、模型设计; 2、完成海量数据的离线/实时加工处理工作,主导或参与团队大数据开发流程的优化以及相关技术问题的解决; 3、基于应用场景进行数据模型抽象,支持数据产品建设,助力业务发展。
更新于 2024-10-09
社招3-5年数据
1.参与滴滴国际金融支付、信贷、信用卡离线、实时数据集市和实时指标开发工作; 2.参与滴滴国际金融部门离线、实时相关数据规划、设计以及落地; 3.参与风控实时数据计算和服务的性能优化与运维,为业务提供稳定的服务; 4.参与实时标签数据的计算,与算法、标签部门密切合作;
更新于 2025-08-29