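Meituan Commercial Value-Added Business - Big Data Development Engineer (Advertising)
Experienced hire, full-time, 5+ years of experience
Core Local Commerce - Business R&D Platform
Location: Beijing | Shanghai
Status: Hiring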
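Requirements
1. Bachelor's degree or above in computer science or a related field, solid computer science fundamentals, and at least 5 years of data development experience;
2. Expert knowledge of data warehouse modeling theory, including normalized modeling and dimensional modeling; expert in ETL and SQL development; hands-on experience in data warehouse construction and big data development;
3. Proficient with offline and real-time data warehouse computing frameworks such as Hive, Spark, and Flink, with a deep understanding of how they work internally;
4. Proficient in building data systems with Java and MySQL; proficient with development frameworks such as Spring MVC and Spring Boot; practical engineering experience with common middleware such as message queues and caching systems;
5. Proficient in data warehouse metadata management, data governance, and data quality practices, with hands-on experience putting them into production;
6. Excellent communication and teamwork skills, and strong analytical and problem-solving abilities;
7. Excellent business understanding, with the ability to analyze and solve problems from a business perspective.

Preferred qualifications
1. Familiarity with the internet advertising business and experience building advertising data systems
2. Experience building data warehouses for large-scale, complex internet data scenarios, and familiarity with data governance and management
3. Experience with OLAP systems such as ClickHouse and Doris, and with data lake systems such as Hudi
4. Experience building tag and user-profile data systems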
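Responsibilities
1. Build the data marts for the advertising business, using standardized data construction to improve data stability and delivery efficiency
2. Build data services that are stable and reliable, supporting the iteration of business systems
3. Build data products that provide stable data support for advertising operations and decision-making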
Includes English-language materials
Education+
Data warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL+
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
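To make the extract, transform, load flow described above concrete, here is a minimal sketch in Python. It is not taken from the posting or from any of the linked resources: the two sample sources, the `ad_clicks` table, and all function names are hypothetical, and an in-memory SQLite database stands in for a real warehouse target.

```python
import csv
import io
import sqlite3

# Hypothetical sample sources; names and values are illustrative only.
CSV_SOURCE = "ad_id,clicks\n a1 ,10\na2,\na3,7\n"
API_SOURCE = [{"ad_id": "A1", "clicks": 5}, {"ad_id": "a4", "clicks": 2}]


def extract():
    """Extract: read raw rows from both sources into one list of dicts."""
    rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    rows.extend(API_SOURCE)
    return rows


def transform(rows):
    """Transform: normalize ids, drop rows without a click count, sum per ad."""
    totals = {}
    for row in rows:
        ad_id = str(row["ad_id"]).strip().lower()
        clicks = str(row["clicks"]).strip()
        if not clicks:
            continue  # skip rows with a missing click count
        totals[ad_id] = totals.get(ad_id, 0) + int(clicks)
    return sorted(totals.items())


def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ad_clicks "
        "(ad_id TEXT PRIMARY KEY, clicks INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO ad_clicks VALUES (?, ?)", records)
    conn.commit()


def run_etl():
    """Run the three stages against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract()), conn)
    return conn.execute(
        "SELECT ad_id, clicks FROM ad_clicks ORDER BY ad_id"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl())
```

Keeping the three stages as separate functions means each can be tested or swapped on its own, for example replacing the SQLite target with a warehouse table.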
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Big data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
MySQL+
https://juejin.cn/post/7190306988939542585
A hands-on, all-in-one MySQL learning roadmap covering database fundamentals, SQL statement usage, database constraints, database design, and more.
[English] MySQL Tutorial
https://www.mysqltutorial.org/
your go-to resource for mastering MySQL in a fast, easy, and enjoyable way.
https://www.youtube.com/watch?v=5OdVJbNCSso
MySQL SQL tutorial for beginners
https://www.youtube.com/watch?v=7S_tz1z_5bA
This beginner-friendly course teaches you SQL from scratch.
Spring+
https://liaoxuefeng.com/books/java/spring/index.html
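Spring is a framework that supports rapid development of Java EE applications. It provides a set of underlying containers and infrastructure and integrates seamlessly with many popular open-source frameworks, which makes it practically essential for Java EE development.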
https://spring.io/guides/gs/rest-service
https://spring.io/quickstart
Level up your Java code and explore what Spring can do for you.
Spring Boot+
https://spring.io/guides/gs/spring-boot
This guide provides a sampling of how Spring Boot helps you accelerate application development.
https://www.youtube.com/watch?v=Nv2DERaMx-4&list=PLzUMQwCOrQTksiYqoumAQxuhPNa3HqasL
The author teaches you how to use Spring Boot from a complete beginner, to building a REST API with a real database, Dockerising it and deploying it to the cloud.
Development frameworks+
[English] Understanding Modern Development Frameworks: A Guide for Developers and Technical Decision-makers
https://www.freecodecamp.org/news/understanding-modern-development-frameworks-guide-for-devs/
Middleware+
https://www.youtube.com/watch?v=1oWPUpMheGk
Message queues+
https://www.youtube.com/watch?v=xErwDaOc-Gs
Caching+
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
Data governance+
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Doris+
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
OLAP+
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
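Related positions
Experienced hire, 3+ years of experience
Core Local Commerce - …
1. Compute core data such as delivery models and entity models in the ad retrieval system, and build the corresponding indexes
2. Own the architecture design, feature development, and operations optimization of the unified batch and stream computing framework in the advertising B-side and C-side data pipelines
3. Own the architecture design, feature development, and operations optimization of the data platform within the system
4. Build productivity tooling for the system, including monitoring tools, troubleshooting tools, and diff tools
Updated 2025-05-06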
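Experienced hire
Core Local Commerce - …
1. Bring large models into production in commercial value-added business scenarios to support business goals;
2. Apply large models to interactive conversational scenarios such as intelligent assistants, telesales, and IM, raising the level of automation and intelligence, improving the merchant interaction experience, and increasing ad supply;
3. Apply large models to content-generation scenarios such as business diagnostic analysis and multimodal creative generation, lowering operating costs for the platform and merchants and improving operational efficiency;
4. Explore techniques for large language models such as fine-tuning, preference alignment, and knowledge augmentation, and actively follow AIGC application trends in the industry, including but not limited to multimodal models, RLHF, and Agents;
5. Work closely with other teams, including data engineers, front-end and back-end engineers, and product managers, to deliver high-quality products and solutions.
Updated 2024-11-19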
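Experienced hire, 3+ years of experience
Core Local Commerce - …
1. Build the advertiser delivery and value-added platform: develop unified, easily extensible system capabilities for ad budgets, bidding, creatives, performance data, contracts, and more, to quickly support the monetization needs of Meituan's multiple businesses (food delivery / Instashopping / pharmacy), multiple customer types (small and medium businesses / KA / brand merchants), and multiple product lines;
2. Build the ad supply operations platform: support advertiser operations and marketing strategies of all kinds, building an operations matrix that covers diagnostics and support, delivery recommendations, marketing campaigns, a task system, and a sales system, and work with the advertising business to meet supply goals;
3. Optimize the ad billing and settlement system: through improved system architecture, simplified processes, adversarial drills, and similar measures, ensure the system reliably handles billing traffic at higher throughput and lower latency;
4. Strengthen the business assurance system across the full advertising pipeline, improving the stability of the advertising business and its systems, including delivery data correctness, safeguarding of ad billing funds, and unified business monitoring, diagnosis, and fault localization.
Updated 2025-05-14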