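ByteDance Data Warehouse Development (Senior) Engineer
Experienced hire, full-time, JTY31. Location: Beijing. Status: Hiring
Includes English-language materials
Data Warehouse+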
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Apache Storm+
[English] Tutorial
https://storm.apache.org/releases/2.6.0/Tutorial.html
In this tutorial, you'll learn how to create Storm topologies and deploy them to a Storm cluster.
https://www.baeldung.com/apache-storm
This tutorial will be an introduction to Apache Storm, a distributed real-time computation system.
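Related Positions
Experienced hire, 1+ years, A92812
1. Plan, design, and build the data system for the Jianying (剪映) business, making the business data-driven through data products and data services.
2. Design, develop, and tune the performance of the Jianying offline data warehouse, real-time data warehouse, and data services, providing reliable, unified offline and real-time data services for upstream analysis and data mining.
3. Build the Jianying data analysis platform, offering product, operations, and analyst teams a well-designed interactive, visual analytics workbench at trillion-record scale.
4. Own offline and real-time ETL work, provide customized data support for the business, and optimize the performance of compute jobs.
Updated 2025-06-17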
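Experienced hire, 3+ years, IDG
- Design and build an efficient data warehouse architecture for autonomous driving, including data model design, ETL process development, and data service construction.
- Design and implement an efficient data storage engine that stores and quickly loads PB-scale data while ensuring data accuracy and consistency.
- Deliver efficient access solutions for the training and evaluation data of different business models, effectively supporting autonomous-driving model training.
- Work with data scientists and data analysts to extend and customize the data warehouse according to business needs.
- Drive the onboarding of the various autonomous-driving business flows into the data warehouse, helping the business improve efficiency.
Updated 2024-11-11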
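Experienced hire, 3+ years, Core Local Commerce-基
1. Design, develop, and maintain the company-wide user identity system, ensuring its capabilities meet the needs of business growth.
2. Iterate on, optimize, and maintain the company-wide user identity association data to meet the business's various needs.
3. Carry out data development for user identity strategies, validate their effectiveness, and drive them to launch.
Updated 2025-06-22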