ByteDance Senior Backend Engineer - OLAP Analytical Database
Experienced hire · Full-time · 3PHL · Location: Beijing · Status: Hiring
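Requirements
1. Bachelor's degree or above in computer science or a related field;
2. Proficiency in two or three of Golang, Java, and Python;
3. Extensive software engineering experience building and maintaining large-scale enterprise backend systems;
4. Experience with Kubernetes container development and building cloud services is a strong plus;
5. Familiarity with the big data ecosystem and knowledge of big data stacks such as Hadoop, Hive, Kafka, Spark, and Druid; experience with analytical databases (ClickHouse, Doris, etc.) is especially welcome.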
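Responsibilities
We are building a petabyte-scale database and data analytics product that helps enterprise customers make data-driven decisions. The product also supports data processing and decision-making inside ByteDance.
Job description:
1. Develop the access services for ByteDance's next-generation analytical data warehouse product;
2. Help design and implement an automated optimization and diagnostics system for distributed systems on the scale of ten thousand machines;
3. Take part in building internal services on the cloud and support code development for ToB (enterprise-facing) products;
4. Work with the team to build a stable, easy-to-use enterprise-grade product.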
Includes English-language materials
Education+
Go+
https://www.youtube.com/watch?v=8uiZC0l4Ajw
A complete Golang tutorial! Under an hour from start to finish, including a full demonstration of building an API in Go. No filler, only what you need to know.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, no prior experience required, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Kubernetes+
https://kubernetes.io/docs/tutorials/kubernetes-basics/
This tutorial provides a walkthrough of the basics of the Kubernetes cluster orchestration system.
https://kubernetes.io/zh-cn/docs/tutorials/kubernetes-basics/
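This tutorial (Chinese edition) introduces the basics of the Kubernetes cluster orchestration system. Each module includes background information on Kubernetes' main features and concepts, along with an online tutorial to work through.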
https://www.youtube.com/watch?v=s_o8dwzRlu4
Hands-On Kubernetes Tutorial | Learn Kubernetes in 1 Hour - Kubernetes Course for Beginners
https://www.youtube.com/watch?v=X48VuDVv0do
Full Kubernetes Tutorial | Kubernetes Course | Hands-on course with a lot of demos
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
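Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial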
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Doris+
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
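Related Positions
Experienced hire A27915
1. Develop the access services for ByteDance's next-generation analytical data warehouse product; 2. Help design and implement backend data management for distributed systems on the scale of ten thousand machines; 3. Take part in building internal services on the cloud and support code development for ToB products; 4. Work with the team to build a stable, easy-to-use enterprise-grade product.
Updated 2023-08-17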
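Experienced hire A180587
1. Develop the access services for ByteDance's next-generation analytical data warehouse product; 2. Help design and implement backend data management for distributed systems on the scale of ten thousand machines; 3. Take part in building internal services on the cloud and support code development for ToB products; 4. Work with the team to build a stable, easy-to-use enterprise-grade product.
Updated 2025-05-08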
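Experienced hire · 2+ years · IDG
- Build the end-to-end data flow for autonomous driving data and the AI Infra systems for the data closed loop, covering the full chain of data mining frameworks, labeling, data pipelines, distributed training, model evaluation, inference, and deployment
- Own the stability and efficiency of the data closed-loop system; have ideas and a concrete path for putting the data flywheel into practice
- Build data storage systems for AI Infra; understand infra framework design under high throughput
- Optimize engineering architecture for the large-model era: coordinate with business teams and assess their needs, analyze and resolve problems in the current systems, and propose and implement design improvements
Updated 2025-10-14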