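ByteDance Senior Data Warehouse Engineer - Group Information Systems
Experienced hire | Full-time | 3+ years | A61080 | Location: Beijing | Status: Hiring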
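Requirements
1. Familiar with data warehouse architecture, data modeling methods, and data governance, with strong SQL/ETL development skills.
2. Interested in exploring the value of data, with strong business understanding and abstraction skills; able to analyze and understand problems quickly.
3. Proficient in the big data technology stack, including Hadoop, Hive, Spark, Flink, and OLAP engines.
4. Excellent logical thinking, passionate about solving challenging problems, and skilled at analyzing and resolving issues.
5. Solid fundamentals in data structures and database principles; bachelor's degree or above in science or engineering; 3+ years of data warehouse modeling experience.
6. Good communication and teamwork skills and a strong sense of responsibility; experience with applications in finance, accounting, or related industries is preferred.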
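Responsibilities
1. Abstract business problems appropriately and design solutions for them; design and develop a high-quality foundational data layer that drives fast, healthy business growth.
2. Own the architecture design and development of data models, performance tuning at massive data scale, and delivery of requirements in complex business scenarios.
3. Help build data management capabilities around data security, quality, efficiency, and cost, and roll them out across horizontal scenarios.
4. Work deeply with the business, understand and appropriately abstract business requirements, unlock the value of data, and collaborate closely with business teams.
5. Participate in data platform architecture design and the development of core module tasks.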
Includes English-language materials
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Data governance
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
ETL
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
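Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows large data sets to be processed in a distributed manner across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial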
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
OLAP
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial (Java)
Education
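Related positions
Experienced hire | A116960
1. Abstract business problems appropriately and design solutions for them; design and develop a high-quality foundational data layer that drives fast, healthy business growth.
2. Carry out data collection, cleansing, and reduction within the data warehouse.
3. Provide business-facing data services, including metric computation, multidimensional analysis, and presentation.
4. Abstract business logic based on the business and product context, and build and develop the big data platform.
5. Participate in data platform architecture design and core development tasks.
Updated 2023-06-28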
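Experienced hire | 5-10 years | NetEase | Corporate functions
1. Own the planning and design of the data warehouse for NetEase Group's finance data middle platform.
2. Collect, cleanse, organize, deduplicate, and govern the relevant raw data, ensuring its timeliness, completeness, consistency, and accuracy.
3. Take part in business requirements research; design the data warehouse dimensional model from business requirements, develop the data models, and consolidate data metrics.
4. Continuously improve and optimize ETL and analytical processing, and analyze structured data.
5. Manage project development progress and code quality, and build up technical documentation.
Updated 2025-10-10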

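Experienced hire | 8+ years | Computer and network technology
1. Experience building enterprise-level data warehouses for large corporate groups; responsible for developing the enterprise E-HR data warehouse.
2. Plan and design highly reliable and stable big data products, ensuring data accuracy and timeliness.
3. Own scheduling management and optimization plans for offline and real-time jobs, ensuring they run stably and efficiently.
4. Participate in data architecture design and data modeling; build and maintain offline data warehouse models.
5. Provide business-facing data services, including metric processing, multidimensional analysis, and presentation.
6. Strong communication and coordination skills and sharp data insight; able to select the right analysis dimensions and extract key data precisely based on business needs.
Updated 2025-10-15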