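Xiaohongshu Data Engine AIOps/Agent Expert
Experienced hire | Full-time | 3-5 years | Data Engine | Location: Beijing | Shanghai | Hangzhou | Status: Hiring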
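Requirements
1. A degree in computer science or a related field.
2. Familiarity with one or more big data engines, such as Flink, Spark, Ray, Kafka, ClickHouse, or Doris. Experience owning workloads at large data scale, with the ability to optimize and govern them based on engine internals and business scenarios.
3. Hands-on experience delivering one or more of the following in the AIOps or Agent domain: a) Resource allocation, scheduling optimization, and medium- to long-term resource planning for large-scale cloud platforms: applying demand forecasting, operations research optimization, and similar methods to address large-scale colocated (mixed-workload) environ…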
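Responsibilities
1. Own the stability, cost, and efficiency of big data compute and storage engines, including risk management, change control, cost governance, performance optimization, and improving operations efficiency.
2. Develop a deep understanding of the architecture of these platforms and the businesses they support (search and recommendation, advertising, risk control, AI, and others); help those businesses make better architectural decisions for stability, cost, and efficiency; diagnose and optimize production issues; and continually increase the value of data + AI businesses.
3. Build the big data PaaS platform and AI Agents. Explore and deliver cutting-edge technologies and use cases for Agents in the big data domain and for AIOps in technical risk, including intelligent Q&A, reasoning and analysis, capacity planning, data governance, business diagnosis, and risk prediction; bring them into production business scenarios and keep upgrading service capabilities.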
English-language materials
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Ray
https://github.com/ray-project/ray
Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://www.youtube.com/watch?v=FhXfEXUUQp0
In this video, I'll teach you everything you need to know about Ray!
https://www.youtube.com/watch?v=fMiAyj2kgac
Using powerful machine learning algorithms is easy using Ray.io and Python.
https://www.youtube.com/watch?v=q_aTbb7XeL4
Parallel and Distributed computing sounds scary until you try this fantastic Python library.
Kafka
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
ClickHouse
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep