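Meituan Waimai Technology - Big Data Development (Foundational Data Track)
Experienced hire · Full-time · Core Local Commerce - Basic R&D Platform · Location: Beijing · Status: Hiring
Requirements
1. Development experience with one or more of the following platforms or components: Hadoop, Hive, HBase, ES, Storm, Spark, Flink;
2. Command of data warehouse and ETL development techniques and principles; real-time computing experience preferred;
3. Proficiency in one of Java, Python, or Scala; for data structures and algorithm design, a fairly deep…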
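Responsibilities
1. Working from Waimai (food delivery) business scenarios, collaborate closely with business teams to build an operational data system that supports the day-to-day operations and analysis of the Waimai business, business analysis, operations, and strategy teams.
2. Plan and design data services and data tool systems or products for consumer-side (C-end) and business-side (B-end) users; coordinate business stakeholders, PM, RD, and QA to drive the launch and continuous iteration of Waimai data products.
3. Abstract and build data analysis methods and approaches covering themes such as transactions, business operations, subsidies, supply, and experience, to support business growth.
4. Build the Waimai offline and real-time data warehouses; work with business back-line teams and operations to improve business and operational efficiency.
5. Build the system capabilities of the profiling platform and maintain system stability.
Includes English-language materials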
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
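Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial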
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
HBase+
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with the HBase shell.
ElasticSearch+
https://www.youtube.com/watch?v=a4HBKEda_F8
Learn about Elasticsearch with this comprehensive course designed for beginners, featuring both theoretical concepts and hands-on applications using Python (though applicable to any programming language). The course is structured in two parts: first covering essential Elasticsearch fundamentals including index management, document storage, text analysis, pipeline creation, search functionality, and advanced features like semantic search and embeddings; followed by a practical section where you'll build a real-world website using Elasticsearch as a search engine, working with the Astronomy Picture of the Day (APOD) dataset to implement features such as data cleaning pipelines, tokenization, pagination, and aggregations.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Data Warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL+
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
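The extract, transform, load steps described above can be sketched in a few lines of Python using only the standard library. Everything in this sketch is an illustrative assumption rather than part of the linked materials: the sample inputs `ORDERS_CSV` and `REFUNDS`, the table name `orders_clean`, and the helper functions are all hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw inputs standing in for two separate source systems.
ORDERS_CSV = "order_id,city,amount\n1, beijing ,25.50\n2,Shanghai,\n3,BEIJING,40\n"
REFUNDS = [{"id": "3", "city": "Beijing", "refund": "10.00"}]


def extract(csv_text):
    """Extract: read raw rows from a CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(orders, refunds):
    """Transform: clean fields, drop incomplete rows, combine both sources."""
    refund_by_id = {int(r["id"]): float(r["refund"]) for r in refunds}
    cleaned = []
    for row in orders:
        if not row["amount"].strip():
            continue  # skip rows whose amount is missing
        order_id = int(row["order_id"])
        city = row["city"].strip().title()  # normalize spacing and casing
        net = float(row["amount"]) - refund_by_id.get(order_id, 0.0)
        cleaned.append((order_id, city, net))
    return cleaned


def load(rows, conn):
    """Load: write the consistent rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders_clean "
        "(order_id INTEGER PRIMARY KEY, city TEXT, net_amount REAL)"
    )
    conn.executemany("INSERT INTO orders_clean VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the three stages, then query the loaded table by city."""
    load(transform(extract(ORDERS_CSV), REFUNDS), conn)
    return conn.execute(
        "SELECT city, SUM(net_amount) FROM orders_clean GROUP BY city"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

Keeping each stage as its own function means a source or target can be swapped without touching the cleaning logic. A production pipeline would more likely run on an engine such as Hive or Spark than on SQLite.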
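Related positions
Experienced hire · 2+ years · Core Local Commerce-美
1. Use large language model (LLM) technology to improve supply understanding and content generation in business scenarios such as food delivery and in-store dining, mainly: 1) use LLM capabilities for intelligent labeling and data cleaning of dish and merchant attributes, providing high-quality labeled data and features for algorithm models; 2) build the user profiling system for the Xiaomei (小美) product and improve AI recommendation quality; 3) continuously improve how LLMs perform in data processing, and drive adoption of AI-generated content in real business use. 2. Develop data products that provide data support for optimizing Xiaomei operational decisions, including but not limited to: 1) build and maintain the Xiaomei data collection system, and carry out data cleaning, processing, and data warehouse construction; 2) analyze data from product feature iterations, deliver optimization recommendations, and support business decisions and product improvement.
Updated 2025-09-18 · Beijing
Experienced hire · 3+ years · Algorithm Development
1. Combine machine learning with operations research optimization to support fine-grained operations in last-mile pickup and delivery scenarios. 2. Own planning algorithms for last-mile pickup and delivery, including outlet selection, delivery area planning, and allocation of supply and demand resources. 3. Own scheduling algorithms for last-mile pickup and delivery, including courier task scheduling and assignment and work mode design. 4. Own foundational data mining for last-mile pickup and delivery, including prediction and simulation of delivery ETA and on-time rate. 5. Explore applications of large models in Agent assistants and operations research solving. 6. Track cutting-edge algorithms in large models, machine learning, and data mining, and take on the construction of technical frameworks and the optimization of their rollout.
Updated 2025-09-14 · Beijing
Experienced hire · 3+ years · Technology - Development
Position highlights: data AI focus; large data scale with high real-time requirements; both online and offline scenarios; both backend engineering and big data processing. Job description: 1. Design and develop a multi-AI-Agent collaborative architecture covering AI data retrieval, AI search, and AI diagnosis. 2. Build the big data infrastructure for the data AI direction, providing stable data support for upper-layer AI Agents. 3. Build data flows and large-scale, high-concurrency feature services that give algorithm models efficient and convenient feature access. 4. Own stability and performance optimization for online and offline systems, ensuring 99.99% high availability.
Updated 2025-09-09 · Beijing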