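DiDi International Business Group (IBG) - Big Data R&D Intern
Internship / Part-time · Technical · Location: Beijing · Status: Hiring
Requirements
1. Bachelor's degree or above in computer science or a related field; internship experience in data warehousing at an internet company is preferred; 2. Familiar with the Hadoop ecosystem and proficient in Hadoop, Spark, and Flink; familiar with at least one OLAP engine among StarRocks, ClickHouse, and Doris; job-tuning experience is preferred; 3. Familiar with Jav…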
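Responsibilities
1. Participate in building DiDi's international data pipeline end to end: data collection, ETL, modeling, development, and deployment; 2. Help optimize the data warehouse ETL processes and resolve related technical issues.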
Includes English-language materials
Education+
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
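Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows large datasets to be processed in a distributed manner across clusters of computers using simple programming models, and it scales from a single machine to thousands of machines.
[English] Hadoop Tutorial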
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows big data to be stored and processed in a distributed environment across clusters of computers using simple programming models.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
StarRocks+
https://docs.starrocks.io/docs/quick_start/
These Quick Start guides will help you get going with a small StarRocks environment.
https://itnext.io/introduction-to-starrocks-a-new-modern-analytical-database-1db2177d26e1
Recently, I had the opportunity to explore StarRocks, which is the new kid on the block when talking about massive-scale databases that are able to handle petabytes of data.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Doris+
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
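Related positions
Internship · Technical
1. Participate in building DiDi's international data system 2. Participate in building standardized data collection, ETL, modeling, development, and analysis 3. Participate in systematic, process-driven data governance 4. Participate in breaking down, translating, communicating, and resolving business problems
Updated 2026-03-30 · Beijing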
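Internship · Technical
1. Participate in building DiDi's international data system 2. Participate in building standardized data collection, ETL, modeling, development, and analysis 3. Participate in systematic, process-driven data governance 4. Participate in breaking down, translating, communicating, and resolving business problems
Updated 2025-08-18 · Beijing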
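Internship · Technical
1. Use NLP, CV, multimodal large models, and related techniques to analyze feedback data such as reviews and complaints, and build a labeling taxonomy for feedback data. 2. Fine-tune large models to deeply understand multimodal content and build a vertical large model for user feedback understanding. 3. Build a feedback metrics system based on feedback labels and explore the application value of those labels in depth. 4. Carry out full-lifecycle R&D of large-model-based AI Agents, including application architecture design, data construction, model training, and evaluation for both general-purpose and vertical-domain AI Agents. 5. Apply large models in scenarios such as fraud risk assessment and assisted sample labeling to support qualitative judgments, and collaborate with upstream and downstream teams to improve application efficiency.
Updated 2025-12-25 · Beijing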