Xunlei Data Warehouse Development Intern
Internship | Part-time | DBA | Location: Shenzhen | Status: Hiring
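Requirements
1. Bachelor's degree or above, graduating class of 2027 or later, majoring in computer science or a related field; internship experience in data warehousing at an internet company is preferred.
2. Familiar with the Hadoop ecosystem; proficient in Hadoop, Spark, and Flink; familiar with at least one OLAP engine among StarRocks, ClickHouse, and Doris; job tuning experience is preferred.
3. Familiar…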
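Responsibilities
[Immediate start required]
1. Participate in developing and maintaining the data warehouse.
2. Assist with data integration, cleaning, and transformation tasks.
3. Support the team's data analysis and business decision-making work.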
Includes English-language materials
Education+
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
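Hadoop provides reliable, scalable application-layer computing and storage for large computer clusters. It allows large data sets to be processed in a distributed manner across clusters of machines using simple programming models, and it scales from a single machine to several thousand.
[English] Hadoop Tutorial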
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows users to store and process big data in a distributed environment across clusters of computers using simple programming models.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
StarRocks+
https://docs.starrocks.io/docs/quick_start/
These Quick Start guides will help you get going with a small StarRocks environment.
https://itnext.io/introduction-to-starrocks-a-new-modern-analytical-database-1db2177d26e1
Recently, I had the opportunity to explore StarRocks, the new kid on the block among massive-scale databases that can handle petabytes of data.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Doris+
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
Related positions

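Internship
1. Take on business data requests from the company's growth and monetization business lines; participate in data collection, ETL, and data warehouse modeling to build core data assets.
2. Participate in data governance: build a rigorous, unified data quality assessment framework, define and implement effective governance strategies, and keep the data healthy.
3. Following the business workflow, participate in building the data metrics system, drive its development and refinement, proactively monitor metric performance, and provide data support for operational decisions.
Updated 2024-09-14 | Shanghai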

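Internship
1. Take on business data requests from the company's growth and monetization business lines; participate in data collection, ETL, and data warehouse modeling to build core data assets.
2. Participate in data governance: build a rigorous, unified data quality assessment framework, define and implement effective governance strategies, and keep the data healthy.
3. Following the business workflow, participate in building the data metrics system, drive its development and refinement, proactively monitor metric performance, and provide data support for operational decisions.
Updated 2025-06-24 | Shanghai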

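Internship
1. Take on business data requests from the company's growth and monetization business lines; participate in data collection, ETL, and data warehouse modeling to build core data assets.
2. Participate in data governance: build a rigorous, unified data quality assessment framework, define and implement effective governance strategies, and keep the data healthy.
3. Following the business workflow, participate in building the data metrics system, drive its development and refinement, proactively monitor metric performance, and provide data support for operational decisions.
Updated 2025-06-24 | Beijing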