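Meituan | Meituan Platform - Senior Data Warehouse Developer
Experienced hire | Full-time | 6+ years | Core Local Commerce - Basic R&D Platform | Location: Beijing | Status: Hiring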
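Requirements
1. At least 6 years of data development experience; understands the architecture of Hadoop, Hive, Spark, and Flink and how distributed computing frameworks execute; proficient in the underlying principles and tuning methods of big data technologies.
2. Extensive experience supporting business data requirements; able to gather business information and build business abstractions at the level of a domain or subject area; able to identify and define business problems.
3. Deep understanding of data warehouse modeling theory, including normalized and dimensional models; hands-on experience implementing data architecture in complex business scenarios.
4. Proficient in the principles and application of at least one OLAP tool such as Doris, ES, Druid, or ClickHouse; able to design and tune data structures for the chosen tool.
5. Open-minded, curious, and self-driven.
Preferred qualifications:
Has taken part in the end-to-end design and implementation of a data warehouse for a complex business, from zero to one.
Knows data governance methodology and has team management experience.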
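Responsibilities
1. Design and develop the foundational data warehouse and application-layer data marts for Meituan Platform business lines, including user growth, Meituan Games, Meituan Live, Meituan Membership, Meituan home page products, and 服体.
2. Support the development of company-level data products for Meituan Platform; take a deep part in managing company-level traffic and user data metric definitions and in data governance; design and develop data productivity tools.
3. Build the data warehouse OLAP system and efficient, flexible online analytics applications over massive data.
4. Act as the single point of contact and integrated solution provider for business data requirements, offering a one-stop service; communicate across teams and drive fixes for problems in the data production pipeline.
5. Develop a deep understanding of Meituan Platform's consumer-facing business scenarios and help the Meituan App meet its business goals.
6. As the data warehouse owner for a business direction, support the data requirements of Meituan Platform business departments, continuously improving requirement-support efficiency, data development efficiency, and data quality while lowering data costs.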
Includes English-language materials
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
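Hadoop provides reliable, scalable application-layer computing and storage for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial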
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Doris
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
Elasticsearch
https://www.youtube.com/watch?v=a4HBKEda_F8
Learn about Elasticsearch with this comprehensive course designed for beginners, featuring both theoretical concepts and hands-on applications using Python (though applicable to any programming language). The course is structured in two parts: first covering essential Elasticsearch fundamentals including index management, document storage, text analysis, pipeline creation, search functionality, and advanced features like semantic search and embeddings; followed by a practical section where you'll build a real-world website using Elasticsearch as a search engine, working with the Astronomy Picture of the Day (APOD) dataset to implement features such as data cleaning pipelines, tokenization, pagination, and aggregations.
OLAP
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
Data Structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Data Governance
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
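Related positions
Experienced hire | 3+ years | NetEase Cloud Music
1. Develop the offline data warehouse for the music business; through sound data architecture, ensure the accuracy, consistency, and stability of internal and external data, covering data cleansing, model design, data governance, and stability assurance.
2. Understand the business in depth; drawing on insight into business strategy, consolidate business data requirements and deliver systematic solutions.
3. Work with data analysts to put data to work for product operations, using technical innovation to create value for the business.
Updated 2025-03-12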
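Experienced hire | 5+ years | Technology
1. Independently own requirements management, architecture design, model building, and data development for the real-time and offline data warehouses of a financial business segment, ensuring that data services are stable and accurate.
2. Mentor junior and mid-level data warehouse team members in data warehouse modeling, data governance, and financial business knowledge.
3. Reduce the cost and improve the efficiency of data warehouse work through data asset governance and faster delivery of data requirements.
4. Work closely with upstream and downstream teams to provide effective data support for financial business analysis, business decision-making, business operations, and data products.
5. Benchmark against advanced data technologies in the industry and apply best practices to resolve pain points in business data requirements.
Updated 2025-08-12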
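Experienced hire | 3+ years | NetEase Cloud Music
1. Develop the offline data warehouse for the music business; through sound data architecture, ensure the accuracy, consistency, and stability of internal and external data, covering data cleansing, model design, data governance, and stability assurance.
2. Understand the business in depth; drawing on insight into business strategy, consolidate business data requirements and deliver systematic solutions.
3. Work with data analysts to put data to work for product operations, using technical innovation to create value for the business.
Updated 2024-12-04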