Kuaishou Senior Data Lake Expert - [Hangzhou]
Experienced hire · Full-time · D7195 · Location: Hangzhou · Status: Hiring
Requirements
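1. Bachelor's degree or above in computer science and technology, software engineering, or a related field;
2. Proficient in Java/Rust, with solid computer science fundamentals: a firm grasp of data structures and algorithms, and a deep understanding of computer architecture, operating systems, and computer networks;
3. Command of data lake theory, technology, and literature; familiar with mature data lake services in the industry, such as Hudi and Iceberg;
4. Command of the fundamental theory and literature of distributed computing systems; familiar with mature distributed computing frameworks and services in the industry, such as Spark, Flink, Doris, K8s, and ClickHouse; experience developing and maintaining large-scale distributed computing services is a plus;
5. Familiarity with vectorization, index design, execution frameworks, SQL execution optimization, and related techniques is a plus;
6. Passionate about technology, with a strong sense of responsibility and the ability to work under pressure; good communication skills and the ability to integrate quickly into a team; a strong learner who can quickly pick up cutting-edge technologies.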
Responsibilities
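1. Build an industry-leading data lake service that provides efficient, minimalist EB-scale data storage and processing, and drive the full lakehouse transformation of Kuaishou's data system;
2. Build a vectorized execution engine and, drawing on microarchitecture characteristics, continuously optimize its execution performance;
3. Design and develop automated data production capabilities to continuously reduce the cost of data production;
4. Follow mature experience and technology from academia and industry, and plan and drive the continued evolution and iteration of Kuaishou's data system.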
Includes English-language materials
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Rust
https://www.youtube.com/watch?v=BpPEoZW5IiY
In this comprehensive Rust course for beginners, you will learn about the core concepts of the language and underlying mechanisms in theory.
https://www.youtube.com/watch?v=lzKeecy4OmQ
Full Rust 101 Crash Course for beginners.
https://www.youtube.com/watch?v=rQ_J9WH6CGk
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Doris
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
Kubernetes
https://kubernetes.io/docs/tutorials/kubernetes-basics/
This tutorial provides a walkthrough of the basics of the Kubernetes cluster orchestration system.
https://kubernetes.io/zh-cn/docs/tutorials/kubernetes-basics/
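This tutorial (Chinese edition) covers the basics of the Kubernetes cluster orchestration system. Each module contains background information on major Kubernetes features and concepts, and includes an interactive online tutorial.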
https://www.youtube.com/watch?v=s_o8dwzRlu4
Hands-On Kubernetes Tutorial | Learn Kubernetes in 1 Hour - Kubernetes Course for Beginners
https://www.youtube.com/watch?v=X48VuDVv0do
Full Kubernetes Tutorial | Kubernetes Course | Hands-on course with a lot of demos
ClickHouse
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Related positions
Experienced hire · D7195
Updated 2025-03-07
Experienced hire · 7+ years · Technical - Data
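● We stand at the forefront of the AI- and data-driven transformation of local lifestyle services. We are committed to using cutting-edge AI and data capabilities to restructure the business logic of core scenarios such as dining, retail, and in-store services, building an intelligent closed-loop ecosystem that connects hundreds of millions of daily active users with tens of millions of merchants.
● As a key contributor to core data architecture and platform construction, you will lead the build-out of Amap's end-to-end data asset system for "destination services", driving data governance, layered modeling, asset accumulation, and efficient application to create a data foundation that supports business growth over the next 5-10 years. You will take a deep, leading role in the AI-driven upgrade of the data platform, empowering key scenarios such as merchant operations, sales workflows, operational insight, and management analysis, and building a matrix of intelligent data product capabilities to move from "data is available" to "data is easy to use" to "data-driven".
● You will also face the governance challenges and architectural evolution opportunities of massive business data assets, continuously improving the data platform's accuracy, stability, real-time performance, and scalability to build an industry-leading data application platform.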
Updated 2025-08-12
Experienced hire · 8+ years · Intelligence and Information Technology
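1. Data asset system design and construction: based on the department's business characteristics, design an efficient data asset system and guide the team in building it;
2. Data governance system construction: define and refine data governance strategies, processes, and standards; drive the rollout of data governance work; increase the value of data assets; and provide reliable data support for the business;
3. Data application support: work closely with the algorithm team, understand data application requirements in depth, and provide technical solutions and implementation paths;
4. Technical guidance and collaboration: as a technical expert, provide technical guidance and training to team members to raise the team's overall technical level; participate in the team's technical planning and decisions; resolve technical problems; and ensure projects proceed smoothly.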