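Amap (高德地图) | Amap - Data Warehouse Development Engineer/Expert (Beijing) - Information Engineering Team
Experienced hire | Full-time | 3+ years | Technical - Data | Location: Beijing | Status: Hiring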
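Requirements
1. Education: bachelor's degree or above in computer science, statistics, mathematics, data science, or a related field.
2. Technical skills:
○ 3+ years of data warehouse development experience, with an internet-company background
○ In-depth understanding of big data systems such as Hadoop, Hive, HBase, Spark, and Flink
○ Java fundamentals, with experience developing UDFs and interface services
○ Experience with real-time data development
○ Experience with OLAP-type systems, ClickHouse, Doris, time-series data, and similar is a plus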
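Responsibilities
1. Build and design the offline and real-time data warehouses for Amap's advertising business, including data model design, ETL development, and ETL performance tuning.
2. Build the data subject areas for Amap's advertising business, including traffic, supply, customer leads, operations, and settlement; understand the business deeply and abstract it sensibly, solve business pain points, continuously improve data usage and data analysis capabilities, and realize the value of data.
3. Build ad attribution across Amap's full-domain data, continuously optimize attribution strategies, and safeguard the value of platform traffic.
4. Strengthen the Amap data warehouse team's data architecture and data governance capabilities; continuously improve requirement-delivery efficiency and data development efficiency, raise data quality, and lower the cost of using data.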
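The third responsibility above mentions ad attribution without naming a strategy. As a purely illustrative sketch (the posting does not say which model the team uses), the Python below shows last-click attribution within a lookback window, one common baseline. The event fields, the 7-day window, and the function name `last_click_attribution` are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical lookback window; the posting does not specify one.
LOOKBACK = timedelta(days=7)


def last_click_attribution(clicks, conversions, lookback=LOOKBACK):
    """Credit each conversion to the user's most recent ad click in the window.

    clicks:      list of dicts with keys user_id, ad_id, ts (datetime)
    conversions: list of dicts with keys user_id, order_id, ts (datetime)
    Returns a dict mapping order_id -> ad_id, or None when no click qualifies.
    """
    # Group clicks per user in chronological order.
    clicks_by_user = {}
    for click in sorted(clicks, key=lambda c: c["ts"]):
        clicks_by_user.setdefault(click["user_id"], []).append(click)

    attributed = {}
    for conv in conversions:
        credited = None
        for click in clicks_by_user.get(conv["user_id"], []):
            # A click qualifies if it precedes the conversion within the window;
            # later qualifying clicks overwrite earlier ones (last click wins).
            if conv["ts"] - lookback <= click["ts"] <= conv["ts"]:
                credited = click["ad_id"]
        attributed[conv["order_id"]] = credited
    return attributed


# Made-up sample events, for illustration only.
clicks = [
    {"user_id": "u1", "ad_id": "ad_A", "ts": datetime(2025, 4, 1, 10, 0)},
    {"user_id": "u1", "ad_id": "ad_B", "ts": datetime(2025, 4, 3, 9, 0)},
    {"user_id": "u2", "ad_id": "ad_C", "ts": datetime(2025, 3, 1, 12, 0)},
]
conversions = [
    {"user_id": "u1", "order_id": "o1", "ts": datetime(2025, 4, 3, 18, 0)},
    {"user_id": "u2", "order_id": "o2", "ts": datetime(2025, 4, 2, 8, 0)},
]
attributed = last_click_attribution(clicks, conversions)
```

In a warehouse setting the same logic would more likely be expressed as a windowed join in Hive, Spark, or Flink SQL; the in-memory version here only illustrates the rule itself.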
Includes English-language materials
Data Science
https://roadmap.sh/ai-data-scientist
Step-by-step roadmap guide to becoming an AI and Data Scientist
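Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to several thousand machines.
[English] Hadoop Tutorial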
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with the HBase shell.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
OLAP
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
ClickHouse
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Doris
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
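Related positions
Experienced hire | 3+ years | Technical - Data
1. Build and design the offline and real-time data warehouses for Amap's advertising business, including data model design, ETL development, and ETL performance tuning.
2. Build the data subject areas for Amap's advertising business, including traffic, supply, customer leads, operations, and settlement; understand the business deeply and abstract it sensibly, solve business pain points, continuously improve data usage and data analysis capabilities, and realize the value of data.
3. Build ad attribution across Amap's full-domain data, continuously optimize attribution strategies, and safeguard the value of platform traffic.
4. Strengthen the Amap data warehouse team's data architecture and data governance capabilities; continuously improve requirement-delivery efficiency and data development efficiency, raise data quality, and lower the cost of using data.
Updated 2025-04-10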
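Experienced hire | D2303
1. Take responsibility for big data development on Kuaishou's commercialization brand-marketing platforms; build a brand asset platform for advertisers and creators, and accumulate core user data assets.
2. Take part in the architecture, planning, and rollout of an industry-leading commercialization data warehouse.
3. Ensure timely, stable output of offline and real-time data, and safeguard data quality.
4. Support the building of efficiency tools for data development, continuously improving data development efficiency.
Updated 2025-05-20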
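Experienced hire | 3-5 years | D6213
1. Build site-wide foundational data capabilities, provide rich and stable common base data for the short-video community, and explore further incremental value from data capabilities.
2. Build data topic systems (e.g. social, content production/consumption, live streaming); empower the business through data + algorithms + product, providing end-to-end, analyzable, reusable data capabilities and more intuitive, analysis-guiding productized capabilities.
3. Build company-level core data assets, tightly integrated with business scenarios, providing data and product solutions for community services that turn data into services and into business value.
4. Build the site-wide data governance and management system, combining business + metadata + technology, to safeguard data quality and stable output across the company's business services.
Updated 2025-03-07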