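Kuaishou Senior Data R&D Engineer [Kling AI (可灵AI) Special Project]
Experienced hire | Full-time | 3-5 years | J0012 | Location: Beijing | Status: Hiring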
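Qualifications
1. Extensive experience in data warehouse and data platform architecture; able to build and optimize data warehouses, data systems, and data value through a deep understanding of the AI domain.
2. Experience developing applications on distributed data storage and computing platforms; familiar with Hive, Kafka, Spark, Storm, HBase…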
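Responsibilities
1. Build the foundational data capabilities of Kling AI, providing rich, stable common base data for content-generation AI products.
2. Build core data assets, tightly integrated with business scenarios, and provide end-to-end data services and data solutions for content generation services.
3. Build the data governance and management system, combining business, metadata, and technology, to ensure data quality and stable output across the company's business services.
4. During the business growth phase, explore and uncover more opportunities for the data R&D role to deliver incremental business value.
5. Able to use AI to improve efficiency.
6. Candidates who understand AI-generated content and can turn that understanding into data-driven implementations are preferred.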
Related English-language materials
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Kafka
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Apache Storm
[English] Tutorial
https://storm.apache.org/releases/2.6.0/Tutorial.html
In this tutorial, you'll learn how to create Storm topologies and deploy them to a Storm cluster.
https://www.baeldung.com/apache-storm
This tutorial will be an introduction to Apache Storm, a distributed real-time computation system.