哔哩哔哩高级/资深数据开发工程师-商业化
社招全职2年以上技术类地点:上海状态:招聘
任职要求
要求: 1. 本科及以上学历,计算机、数学、统计等相关专业背景; 2. 2年以上数据开发经验,熟悉数据仓库和数据集市的设计和开发; 3. 熟悉大数据平台和工具,如Hadoop、Hive、Spark、Kafka、Flink等; 4. 熟悉数据仓库架构,了解数据仓库建模方法和技巧;
工作职责
1. 负责公司内部商业化数据的开发和维护,为产品和营销团队提供数据支持和分析服务; 2. 设计和开发商业化数据仓库和数据集市,实现数据的采集、清洗、存储和分析; 3. 负责数据架构的设计和维护,确保数据准确性、完整性和安全性; 4. 参与业务需求分析和数据建模工作,编写SQL语句完成数据提取、转换和加载(ETL); 5. 能够独立完成数据问题的排查和处理,解决数据质量和性能问题; 6. 具有良好的沟通能力和团队协作能力,与不同部门的业务人员和技术人员合作,推进数据项目的进展。
包括英文材料
学历+
数据仓库+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
大数据+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop 为庞大的计算机集群提供可靠的、可伸缩的应用层计算和存储支持,它允许使用简单的编程模型跨计算机群集分布式处理大型数据集,并且支持在单台计算机到几千台计算机之间进行扩展。
[英文] Hadoop Tutorial
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[英文] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
相关职位
社招2年以上技术类
1、负责B站广告相关产品的前端开发工作,完整参与整个产品的设计与研发; 2、负责高质量编码设计,承担业务重点、难点的技术攻坚,保证高可靠、高性能、高可扩展的业务支撑; 3、参与前端基础架构建设与组件抽象,探索新技术并推动其落地,提升开发效率和系统稳定性; 4、参与新产品优化与演进,追求极致的用户体验。
更新于 2025-04-07
社招3年以上技术类
1. 参与商业化广告检索数据与样式相关服务的开发与设计,如实验平台,广告索引治理,样式中台建设等; 2. 参与常规需求迭代如广告新样式接入,增量索引开发,程序化DSP对接等;
更新于 2025-04-07