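Meituan Project Internship - Hotel & Travel - Data Development - Beijing
Internship/part-time; Core Local Commerce - Basic R&D Platform; Location: Beijing; Status: Hiring
Requirements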
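1. Solid computer science fundamentals and very strong problem-solving skills;
2. Command of the classic data warehouse modeling methods and familiarity with the trade-offs between them, with 2+ years of data warehouse development experience;
3. Command of the big data technology stack, with substantial hands-on experience applying and developing with big data tools such as Hadoop, Hive, Doris, Spark, Flink, and Kafka;
4. Strong SQL skills, an understanding of how SQL is executed under different frameworks, and practical performance-optimization…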
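Responsibilities
1. Build the data warehouse system for the Hotel & Travel consumer-facing (C-end) business, including business data modeling and the development and management of data application products;
2. Design, develop, and optimize data warehouse ETL pipelines, improving the production efficiency and governance capability of the warehouse system;
3. Build the data warehouse OLAP system, delivering efficient and flexible online analytics applications over massive data;
4. Build the real-time data warehouse, applying systematic thinking to raise real-time development efficiency and reduce production operations and maintenance costs;
5. Help mine user-related features and behavior models to turn data into value.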
Includes English-language materials
Data Warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
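Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial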
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Doris+
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.