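Kuaishou Data Development Engineer - [Integrated Efficiency Line - Enterprise Applications Department]
Experienced hire · Full-time · D13359 · Location: Beijing · Status: Hiring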
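Requirements
1. Bachelor's degree or above; majors in computer science, mathematics, statistics, or related fields preferred;
2. Experience building data warehouses; familiar with dimensional modeling, with a solid understanding of data warehouse layering and the ability to develop against complex data requirements;
3. Proficient in Hive SQL and query tuning; skilled in programming languages such as Java, Shell, and Python; familiar with common UDF development; real-time development ability (Flink) is a plus;
4. Familiar with big data ecosystem components such as Hive/Spark/Hadoop/ClickHouse/Flink/Kafka; experience setting up and maintaining clusters preferred;
5. Strong execution, with good communication and organizational skills and a team-oriented spirit.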
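Responsibilities
1. Responsible for data development for Kuaishou's corporate-function businesses, building the company's internal operations data;
2. Ensure timely, stable delivery of offline and real-time data and safeguard data quality, with the opportunity to own the 0-to-1 design, development, and effectiveness evaluation of data for some business lines;
3. Maintain the compute, storage, and scheduling clusters used for data development, and build tools that improve data development efficiency.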
Includes English-language materials
Education+
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and processing relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Bash+
[English] The Bash Guide
https://guide.bash.academy/
A quality-driven guide through the shell's many features.
https://www.youtube.com/watch?v=tK9Oc6AEnR4
Understanding how to use bash scripting will enhance your productivity by automating tasks, streamlining processes, and making your workflow more efficient.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
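Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows distributed processing of large datasets across clusters of computers using simple programming models, and scales from a single machine to thousands of machines.
[English] Hadoop Tutorial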
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
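Related positions
Experienced hire · 3+ years · D7494
The core product of the Enterprise Applications Department's platform server-side R&D team, Kaleido, is a low-code aPaaS platform solution focused on improving R&D efficiency for enterprise applications. The team brings together strong talent from the low-code platform space, the enterprise applications space, and major tech companies, and offers an excellent environment for growth. Since its release in 2023, the platform has delivered 33 systems across 4 Kuaishou business domains and is now in a phase of rapid iteration alongside ongoing delivery. There are excellent learning opportunities and room to grow here, and we look forward to working with people who are eager to grow and love to explore to set a new industry benchmark together!
Responsibilities and tasks:
1. Participate in software requirements analysis and system design to ensure the system meets business needs;
2. Design, develop, and maintain high-quality Java applications, with attention to code readability and maintainability;
3. Think deeply about how to optimize system performance, stability, and scalability, and implement the corresponding technical solutions;
4. Apply your expertise in areas such as low-code platforms, enterprise data platforms, and permission platforms to provide technical guidance for projects;
5. Work with cross-functional teams to coordinate project progress and ensure on-time delivery;
6. Continuously research new technology trends and bring innovative ideas and best practices to the team.
Updated 2024-12-03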
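Experienced hire · D12920
1. Participate in requirements analysis and design for back-end systems, and take responsibility for back-end data processing development;
2. Design and implement efficient data storage and query solutions, and optimize database performance;
3. Develop and maintain back-end data processing logic, ensuring the completeness and stability of the system's data processing;
4. Take responsibility for performance optimization and security protection of back-end systems, ensuring efficient operation and data security.
Updated 2024-12-03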
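Experienced hire · 3+ years · D4363
1. Responsible for product development on core tax-related projects in the finance domain;
2. Responsible for the design and development of core product modules, ensuring continuous improvement in the performance, stability, and engineering quality of production products;
3. Responsible for requirements analysis, technical validation, technical solution design, and coding for key projects, identifying and resolving difficult technical problems.
Updated 2025-02-18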