ByteDance Big Data R&D Engineer
Experienced hire | Full-time | A140437 | Location: Beijing | Status: Hiring
Requirements
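1. Bachelor's degree or above in computer science or a related field; familiar with big data computing architectures and how they work; familiar with Spark/Flink programming; proficient in Hive, with HQL optimization experience;
2. Familiar with Java, Python, and other programming technologies; strong coding skills; experience developing web services; able to complete module development independently;
3. Understands basic design patterns and can quickly translate business requirements into technical requirements;
4. Proficient with MySQL; proficiency with ElasticSearch or ClickHouse is preferred, and familiarity with their internals is a plus;
5. Good communicator, proactive, with a strong sense of responsibility and good teamwork skills;
6. Good problem analysis and problem-solving skills, with strong learning ability and logical thinking.
Bonus points:
1. Contributor to open-source communities such as GitHub;
2. Ability and experience designing large-scale distributed services.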
Responsibilities
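1. Develop and maintain offline data processing and online data services for the various online advertising businesses;
2. Iterate on data service interfaces and product requirements; perform code review, bug fixes, and routine service operations;
3. For massive-scale data processing and query needs, design a sound multidimensional data analysis system architecture that adapts to business changes and meets diverse requirements;
4. Clean and process massive volumes of logs, and abstract data models that can be reused across multiple businesses.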
Includes English-language materials
Education+
Big data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Web+
https://web.dev/learn
Explore our growing collection of courses on key web design and development subjects.
Design patterns+
https://liaoxuefeng.com/books/java/design-patterns/index.html
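Design patterns are pieces of code design experience that are used again and again in software design. The purpose of using them is to make code reusable and to improve its extensibility and maintainability.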
[English] Design Patterns
https://refactoring.guru/design-patterns
Design patterns are typical solutions to common problems in software design. Each pattern is like a blueprint that you can customize to solve a particular design problem in your code.
https://www.youtube.com/watch?v=NU_1StN5Tkk
Design Patterns tutorial explained in simple words using real-world examples.
MySQL+
https://juejin.cn/post/7190306988939542585
A hands-on, one-stop MySQL learning roadmap covering database fundamentals, SQL statement usage, database constraints, design, and more.
[English] MySQL Tutorial
https://www.mysqltutorial.org/
your go-to resource for mastering MySQL in a fast, easy, and enjoyable way.
https://www.youtube.com/watch?v=5OdVJbNCSso
MySQL SQL tutorial for beginners
https://www.youtube.com/watch?v=7S_tz1z_5bA
This beginner-friendly course teaches you SQL from scratch.
ElasticSearch+
https://www.youtube.com/watch?v=a4HBKEda_F8
Learn about Elasticsearch with this comprehensive course designed for beginners, featuring both theoretical concepts and hands-on applications using Python (though applicable to any programming language). The course is structured in two parts: first covering essential Elasticsearch fundamentals including index management, document storage, text analysis, pipeline creation, search functionality, and advanced features like semantic search and embeddings; followed by a practical section where you'll build a real-world website using Elasticsearch as a search engine, working with the Astronomy Picture of the Day (APOD) dataset to implement features such as data cleaning pipelines, tokenization, pagination, and aggregations.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Related positions
Experienced hire | 3+ years | J6NQP
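1. Build and optimize strategy algorithms for multiple business lines, including Douyin and Douyin Huoshan Edition;
2. Analyze and mine potential correlations in massive data sets to keep improving strategy performance and safeguard the user experience;
3. Handle real-time and offline feature extraction and fusion, providing feature services to the data mining and strategy platforms;
4. Bring big data capabilities into product features and drive data-driven, intelligent products.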
Updated 2021-01-19
Experienced hire | A177865
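About the department: Become one of the driving forces behind ByteDance's advertising revenue growth. Become a pioneer in data-centric technology and build measurable, high-quality data, services, and products. The Non-China Data team is responsible for ad logs, the ad data warehouse, the data center dashboard hub, advertiser data services, and more.
1. Develop and maintain offline data processing and online data services for the various online advertising businesses;
2. Iterate on data service interfaces and product requirements; perform code review, bug fixes, and routine service operations;
3. For massive-scale data processing and query needs, design a sound multidimensional data analysis system architecture that adapts to business changes and meets diverse requirements;
4. Clean and process massive volumes of logs, and abstract data models that can be reused across multiple businesses.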
Updated 2024-01-22
Experienced hire | 3+ years | A136495
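UBA (User Behavior Analysis) is an internal analytics platform used heavily by more than 10,000 employees worldwide. It is a data middle-platform product built on user behavior data that provides data analysis services to ByteDance/Douyin's domestic and international businesses, including Douyin, Toutiao, Xigua, and others. It supports EB-scale data volumes, trillions of events, and millisecond-level response times, giving users simple, flexible, high-performance data analysis services.
1. Participate in multi-region data center deployment and maintenance: experience deploying and maintaining big data platforms across multiple locations, and familiarity with the challenges of cross-region collaboration;
2. Take part in the R&D team's iterative development; develop and guide software testing and validation procedures;
3. Own the architecture design, performance tuning, and troubleshooting of the big data platform;
4. Analyze and resolve complex system performance and stability issues to ensure the system's reliability and stability;
5. Write technical documentation recording system configuration and operating procedures.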
Updated 2024-04-26