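Xiaohongshu Senior Data Warehouse Expert (Real-Time Track)
Experienced hire | Full-time | 5+ years | Data warehouse | Location: Shanghai, Beijing | Status: Hiring
Requirements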
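Bachelor's degree or above in computer science, software engineering, or a related field.
5+ years of development experience with real-time data processing and stream processing systems.
Proficiency in at least one real-time data processing framework, such as Apache Kafka or Apache Flink; experience with unified stream and batch processing is a plus.
Familiarity with data warehouse technology, with hands-on experience using OLAP engines such as StarRocks and ClickHouse.
Solid programming skills, with fluency in Java, Python, Scala, or a similar language.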
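Responsibilities
Develop and maintain real-time data processing and analytics systems, ensuring efficient and stable data flows.
Design and implement high-throughput, low-latency real-time data processing pipelines.
Work with business teams to develop real-time data pipelines and build the real-time data warehouse.
Contribute to the architecture design and performance optimization of the real-time data platform.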
Includes English-language materials
Education
Apache
https://www.apache.org/
The Apache® Software Foundation (ASF) provides software for the public good, guided by community over code.
Kafka
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
StarRocks
https://docs.starrocks.io/docs/quick_start/
These Quick Start guides will help you get going with a small StarRocks environment.
https://itnext.io/introduction-to-starrocks-a-new-modern-analytical-database-1db2177d26e1
Recently, I had the opportunity to explore StarRocks which is the new kid in the block when talking about massive scale databases which are able to handle petabytes of data.
ClickHouse
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
OLAP
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Scala
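Related positions

Experienced hire | 5+ years | Technology
1. Own requirements management, architecture design, model building, and data development for the real-time and offline data warehouses of the financial credit business, ensuring the stability and accuracy of data services;
2. Mentor junior and mid-level data warehouse team members in data warehouse modeling, data governance, and the financial business domain;
3. Reduce costs and improve efficiency in data warehouse work through data asset governance, faster delivery of data requests, and similar measures;
4. Collaborate closely with upstream and downstream teams to provide effective data support for financial credit business analysis, business decision-making, business operations, and data products, empowering the credit business;
5. Benchmark against advanced industry data technologies and apply best technical practices to solve pain points in business data needs.
Updated 2025-02-05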
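Experienced hire | J5LM1
1. Support and guide business metrics development for the product lines under ByteDance;
2. Build a PB-scale data warehouse and take part in its design, modeling, and development;
3. Build ETL data pipelines and automated ETL data pipeline systems;
4. Build an expert system for metrics data processing that combines offline, online, and real-time processing.
Updated 2019-07-28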
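Experienced hire | 7+ years | Technology - Data
● We are at the forefront of the AI- and data-driven transformation of local lifestyle services, committed to using cutting-edge AI and data capabilities to reshape the business logic of core scenarios such as dining, retail, and in-store services, and to building an intelligent closed-loop ecosystem connecting hundreds of millions of daily active users with tens of millions of merchants.
● As a key contributor to core data architecture and platform development, you will lead the construction of Amap's (高德) end-to-end data asset system for "destination services", driving data governance, layered modeling, asset accumulation, and efficient application to build a data foundation that supports business growth over the next 5-10 years. You will be deeply involved in, and lead, the AI-driven upgrade of the data platform, empowering key scenarios such as merchant operations, sales workflows, operational insights, and management analysis, and building a matrix of intelligent data product capabilities that takes the business from "data is available" to "data is easy to use" to "data-driven".
● You will also face the governance challenges and architectural evolution opportunities of massive business data assets, continuously improving the data platform's accuracy, stability, timeliness, and scalability to build an industry-leading data application platform.
Updated 2025-08-12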