Kuaishou Data R&D Engineer (E-commerce Ecosystem) - [Data Platform]
Experienced hire | Full-time | 2+ years | D11441 | Location: Beijing | Status: Hiring
Requirements
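1. Bachelor's degree or above, with at least two years of big-data development experience;
2. Familiar with the Linux platform and proficient in Java and Python, with solid coding fundamentals;
3. At least two years of hands-on experience with two or more of Hive, Kafka, Spark, Flink, and HBase;
4. Familiar with data warehouse theory and methodology, with practical experience in model design and ETL development; has given real thought to data architecture and design, and has strong mathematical and modeling skills;
5. Familiar with distributed computing frameworks and able to design and optimize distributed computations, with some knowledge of other Hadoop ecosystem components such as HBase, Hadoop, Hive, and Druid;
6. Understands stream processing and is familiar with at least one real-time compute engine: Storm, Spark, or Flink;
7. Strong learning, analytical, and problem-solving abilities, good teamwork, and strong communication skills.
Bonus: e-commerce data development experience is preferred.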
Responsibilities
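1. Build the Kuaishou e-commerce data warehouse and the data marts for each vertical application;
2. Own data statistics, report production, effect monitoring, attribution analysis, and business support for new Kuaishou e-commerce products;
3. Define and develop core business metric data, and own data modeling for vertical businesses;
4. Provide big-data computing application services based on business needs, and keep optimizing them;
5. Take part in building and maintaining technical systems such as event-tracking design and the end-to-end data production pipeline.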
Includes English-language materials
Education+
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Linux+
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with full examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
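The idea of processing events as they arrive can be illustrated with a tumbling (fixed, non-overlapping) window count. This is a minimal sketch in plain Python, not the Flink API; the function name `tumbling_window_counts` and the sample events are hypothetical, and timestamps are assumed to be integer seconds.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) over fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) pairs (assumed format).
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Align each event to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical business events: clicks and orders with arrival times.
events = [(1, "click"), (4, "order"), (9, "click"), (12, "click")]
result = tumbling_window_counts(events, 10)
```

A real stream processor also has to handle late or out-of-order events and keep state fault-tolerant, which this sketch leaves out.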
HBase+
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on the Hadoop file system, and ways to interact with the HBase shell.
Data Warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL+
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
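The extract, transform, load steps described above can be sketched end to end with the Python standard library. This is an illustrative sketch only: the sample data, the `fact_orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the target warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical inputs standing in for two separate source systems.
ORDERS_CSV = """order_id,user_id,amount
1,u1,10.50
2,u2,
3,u1,4.50
"""
REFUNDS = [{"order_id": "3", "refund": "4.50"}]


def extract(csv_text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(orders, refunds):
    """Transform: drop rows with no amount, cast types, subtract refunds."""
    refunded = {r["order_id"]: float(r["refund"]) for r in refunds}
    cleaned = []
    for row in orders:
        if not row["amount"]:
            continue  # skip incomplete records
        net = float(row["amount"]) - refunded.get(row["order_id"], 0.0)
        cleaned.append((int(row["order_id"]), row["user_id"], net))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, user_id TEXT, net_amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three steps and return net spend per user from the target."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(ORDERS_CSV), REFUNDS), conn)
    return conn.execute(
        "SELECT user_id, SUM(net_amount) FROM fact_orders "
        "GROUP BY user_id ORDER BY user_id"
    ).fetchall()
```

Keeping the three steps as separate functions makes each one testable on its own, which is the main design choice here; production pipelines would typically run the same stages on Hive or Spark instead of SQLite.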
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
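Hadoop provides reliable, scalable application-layer compute and storage for very large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and it scales from a single machine to several thousand.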
[English] Hadoop Tutorial
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
Related positions
Experienced hire | 2+ years
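1. Own data warehouse architecture design, data development, and operations support for the customer service experience domain; build highly extensible data models, high-value data assets, and a highly available data architecture that meet the convenience and stability needs of different data-usage scenarios;
2. Understand the business in depth, mine business data, proactively identify problems, and use the platform's big-data and AI infrastructure to deliver data products and data services that efficiently support business operations;
3. Distill general-purpose data solutions for common data problems, and provide customized data solutions for problems in vertical business domains.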
Updated 2025-09-19
Experienced hire
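1. Plan and build the data system for core business domains, and efficiently support the data needs of business scenarios through data products and data services;
2. Understand the business in depth, analyze business strategies and pain points, and design and deliver systematic end-to-end data solutions;
3. Own data asset construction and data quality and stability management, and build a shared, interconnected data platform so that data standards are more consistent and data access is more efficient.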
Updated 2025-05-23
Experienced hire | 1+ years
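1. Take part in data model design and data development for the recycling business, including business analysis, requirements gathering, data analysis, and data modeling;
2. Build the foundational data warehouse architecture, covering data architecture, data quality, and performance optimization;
3. Help the team track the product's business growth within the Xianyu ecosystem, provide the corresponding data support, and propose data optimization suggestions;
4. Own data analysis, model development, and system tuning for business-related systems;
5. Drawing on your understanding of the business, contribute ideas, exploration, and innovation on business and data platform capabilities.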
Updated 2025-08-21