
Keep数据研发工程师(J12251)
社招全职1-3年地点:北京状态:招聘
任职要求
1.本科及以上学历,具备1~3年数据仓库开发和实施经验,拥有大中型数据仓库架构、模型、ETL设计相关经验; 2.精通Python、Hive、Spark、Kafka、Flink等数据处理工具,具备较强的编程能力; 3.掌握并熟悉OLAP技术应用,如Starrocks/…
登录查看完整任职要求
微信扫码,1秒登录
工作职责
1.负责公司数据仓库的建设,提供数据和报表支持,能够追踪和评估数据使用效果,并进行持续优化和迭代; 2.对海量业务数据进行清洗、整理、分析,挖掘不同业务场景中的数据价值,为数据驱动的决策提供支撑; 3.参与数据治理工作,提升数据的易用性、准确性以及质量,确保数据满足业务需求; 4.与数据产品团队、工程运维团队紧密合作,确保项目保质保量完成,推动数据仓库项目的顺利实施。
包括英文材料
学历+
数据仓库+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL+
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
中文,免费,零起点,完整示例,基于最新的Python 3版本。
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[英文] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
还有更多 •••