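Meituan Senior Data Development Engineer
Experienced hire | Full-time | 3+ years of experience | Core Local Commerce - Basic R&D Platform | Location: Beijing | Status: Hiring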
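Requirements
1. A background in computer science or a related field, with solid technical fundamentals and strong problem-solving skills.
2. Command of the classic data warehouse modeling methods and familiarity with the trade-offs between them; at least three years of data warehouse development experience.
3. Proficiency with the mainstream big data technology stack and experience using AI coding tools, with substantial hands-on experience applying and developing with big data tools such as Hadoop, Hive, Doris, Spark, Flink, and Kafka.
4. Solid SQL skills, an understanding of how SQL is executed under different frameworks, and practical experience with performance optimization.
5. Strong business understanding and a collaborative, team-oriented attitude.
6. Sustained enthusiasm for technology and the ability to be self-driven.
Preferred qualifications
1. Some system development experience, with the ability to program in languages such as Java or Python.
2. Practical experience with systematic big data governance and with driving large projects to delivery.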
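Responsibilities
1. Design, develop, and maintain the company-wide user identifier system, ensuring that its capabilities keep pace with business growth.
2. Iterate on, optimize, and maintain the company-wide user identifier association data to meet the various needs of the business.
3. Carry out data development for user identifier strategies, verify that the strategies are effective, and drive them into production.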
Includes English-language materials
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
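Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial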
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Doris
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Kafka
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and working with relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest version of Python 3.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Data governance
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
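Related positions
Experienced hire | 1+ years of experience | A92812
1. Plan, design, and build the data system for the Jianying (剪映) business, enabling data-driven operations through data products and data services.
2. Design, develop, and tune the performance of the Jianying business's offline data warehouse, real-time data warehouse, and data services, providing reliable, unified offline and real-time data services for upstream analysis and mining.
3. Build the Jianying data analysis platform, offering product, operations, and analyst teams an interactive, visual analysis workbench that performs well at trillion-record scale.
4. Handle offline and real-time ETL work, provide customized data support for the business, and optimize the performance of compute jobs.
Updated 2025-06-17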
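Experienced hire | 5-10 years of experience
1. Own data analysis and mining for the general business domain, lead the analytical modeling of key business problems, and build interpretable attribution or prediction models.
2. Use data mining, algorithmic models, and similar methods to drive business strategy: produce analysis reports, draw up technical implementation plans, and push identified opportunities through to delivery.
3. Build out the general business domain's internal data assets, manage existing and newly added tables, and handle day-to-day operations and maintenance.
Updated 2025-08-05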
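Experienced hire
1. Build and optimize the AI big data platform, ensuring that data processing is efficient and stable.
2. Take part in developing data models to improve data analysis and business decision-making capabilities.
3. Work with the team to design and implement data solutions for specific business needs.
Updated 2025-03-05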