Kuaishou Data Warehouse Development Engineer
Experienced hire | Full-time | 3+ years | D6254 | Location: Beijing | Status: Hiring
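Requirements
1. Bachelor's degree or above (master's or doctorate preferred) in computer science, mathematics, software, or a related field;
2. Familiar with data warehouse implementation methodology, with an in-depth understanding of data warehouse architecture and experience supporting real business scenarios;
3. Familiar with at least one offline or real-time compute engine: Storm, Spark Streaming, Flink, Hive; a solid understanding of OLAP product optimization;
4. 3+ years of data warehouse and big data development experience, including extensive experience building real-time data systems;
5. Strong sense of responsibility, good communication skills, business awareness, the ability to grasp business context quickly, and an excellent ability to combine technology with business.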
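Responsibilities
1. Own either the offline data warehouse build-out or the architecture design and implementation of the real-time data system within the traffic common data team;
2. Continuously optimize and iterate on the performance and stability of data systems and data services;
3. Dig into the business, understand and reasonably abstract business needs, unlock the value of data, and work closely with business teams;
4. Build an industry-leading data warehouse system for the traffic domain and unlock the value of data.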
Includes English-language materials
Education
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
OLAP
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
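Related positions
Experienced hire | 3+ years | JKYE1
1. Lead or participate in building and operating the distributed data warehouse for the finance business;
2. Lead or participate in building the common layer of enterprise data assets, achieving agility and intelligence in both tooling and results;
3. Gain a deep understanding of the business, identify business and system problems from a data governance perspective, close the data governance loop, and safeguard data quality and efficiency.
Updated 2020-11-30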
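Experienced hire | 3+ years | A221722
1. Build the offline and real-time data warehouse for the core business of the Feishu People product line;
2. Design dimensional models and carry out big data development, solving technical problems such as data job performance optimization and quality improvement;
3. Connect data across business lines to form a unified data model;
4. Own data governance across the entire product line and improve the quality of data assets.
Updated 2024-01-17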
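Experienced hire | 2+ years | A64928A
1. Handle development, tuning, and operations for multimedia network audio/video quality data, and build the data warehouse system;
2. Own data warehouse model design, ETL development, performance tuning at massive data scale, and requirement delivery in complex business scenarios;
3. Participate in data governance: with PB-scale existing data and trillions of new records, improve data usability and data quality and reduce data processing costs;
4. Dig into the business, understand and reasonably abstract business needs, build up high-quality, systematic data assets, and empower the business.
Updated 2025-04-21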