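Kuaishou Senior Data R&D Engineer (Operations) - [Data Platform]
Experienced hire | Full-time | 5-10 years | D6264 | Location: Beijing | Status: Hiring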
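Requirements
1. Substantial experience in data warehouse and data platform architecture; expected to build and optimize the data warehouse, data system, and data value through a deep understanding of the business.
2. Experience developing applications on distributed data storage and computing platforms; familiar with Hive, Kafka, Spark, Storm, HBase, Flink, and related technologies, with hands-on development experience.
3. Systematic thinking and engineering ability; proficient in Java and front-end technologies; experience taking engineering work into production is a plus.
4. Substantial experience in applied algorithm development, with some understanding of machine learning and AI.
5. Candidates with 3-5 years of experience or more are all in demand.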
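Responsibilities
1. Build site-wide foundational data capabilities, provide rich and stable public base data for the short-video community, and explore the incremental value of additional data capabilities.
2. Support the building of data topic systems for the operations area; empower the business through data + algorithms + product; provide end-to-end, analyzable, reusable data capabilities, along with more intuitive productized capabilities that better guide analysis.
3. Build company-level core data assets, deeply integrated with business scenarios, and provide community services with data and product solutions that turn data into services and data into business.
4. Build a site-wide data governance and management system that combines business + metadata + technology to safeguard data quality and stable output for all of the company's business services.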
Includes English-language materials
Data warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
HBase+
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model that is similar to Google's big table designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with HBase shell.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Algorithms+
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Machine learning+
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Apache Storm+
[English] Tutorial
https://storm.apache.org/releases/2.6.0/Tutorial.html
In this tutorial, you'll learn how to create Storm topologies and deploy them to a Storm cluster.
https://www.baeldung.com/apache-storm
This tutorial will be an introduction to Apache Storm, a distributed real-time computation system.
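Related positions
Experienced hire | 3+ years | Technical - Data
1. Responsible for building the data system of Alibaba's International Business Unit; empower the business through data + algorithms + engineering, providing end-to-end, analyzable business service capabilities; identifiable, insight-ready algorithm service capabilities; configurable, reusable data technology capabilities; and more intuitive, more instructive productized capabilities.
2. Build the group's core data assets and deeply integrate the data business with the new retail business; provide data services such as audience operations, product management, category operations, content operations, and coordinated online-offline operations; use data, analysis, algorithms, and productization to deliver a complete set of data and product solutions that turn data into services and data into business for the group's new retail scenarios.
3. Build the data stability system for the data middle platform; build rich technical + business metadata; improve data engines and services, focusing development on two scenarios: moving safeguard measures online and into services, and making safeguard strategies rehearsable. Structure business scenarios, abstract common business logic, and accumulate reusable data insight capabilities; improve the extensibility of the data architecture through templating and componentization to support rapid iteration and horizontal expansion of data products.
Updated 2025-09-30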
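Experienced hire | 2+ years | Technical - Data
1. Build the product data asset system for the International Digital Commerce Group, developing core capabilities such as web-wide price comparison, product selection, and business opportunity discovery to help the business run efficiently.
2. Work with product and engineering teams to deliver data insights and productized solutions through data + algorithms + engineering capabilities, improving the business's data-driven operations.
Updated 2025-05-26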
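Experienced hire | 3+ years | Technical - Data
1. Build a standardized data system for Taobao Overseas (淘海外) and accumulate high-quality data assets to help the business run efficiently.
2. Build attribution analysis and A/B testing data capabilities, creating the business's core decision-making data products to support efficient decisions.
3. Work with product and engineering teams to deliver data insights and productized solutions through data + algorithms + engineering capabilities, improving the business's data-driven operations.
Updated 2025-08-25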