钉钉阿里云智能-SRE工程师(AI Agent产品)-上海
社招全职5年以上地点:上海状态:招聘
任职要求
1. 扎实的运维功底和编程能力,精通Linux系统与Shell,熟练掌握至少一门自动化脚本/工具开发语言(如 Go, Python); 2. 深入理解云原生技术栈,具备生产环境Kubernetes(K8s)集群的运维管理经验,并熟悉阿里云等主流公有云产品; 3. 熟悉主流监控体系(如Prometheus, Grafana),并具备大数据组件(如Flink, Kafka)或数据库(如MySQL, Redis)的运维经验; 4. 优秀…
登录查看完整任职要求
微信扫码,1秒登录
工作职责
1. 负责AI交易平台业务的云原生基础设施建设与运维,保障Kubernetes(K8s)平台及容器化应用的高可用、高性能; 2. 负责基础设施全生命周期管理,包括但不限于阿里云资源、Flink实时计算集群,以及AI应用所需的MCP服务、Runtime调度、模型服务等组件的部署、监控、优化与故障排查; 3. 参与SRE体系的架构设计与技术演进,通过IaC(基础设施即代码)、CI/CD等理念,主导或参与自动化运维平台/工具的开发,提升研发与交付效率; 4. 关注云原生及AI基础设施领域的技术发展趋势,并将其应用于稳定性保障、成本优化和效率提升的实践中。
包括英文材料
Linux+
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Bash+
[英文] The Bash Guide
https://guide.bash.academy/
A quality-driven guide through the shell's many features.
https://www.youtube.com/watch?v=tK9Oc6AEnR4
Understanding how to use bash scripting will enhance your productivity by automating tasks, streamlining processes, and making your workflow more efficient.
脚本+
[英文] Scripting language
https://en.wikipedia.org/wiki/Scripting_language
https://zhuanlan.zhihu.com/p/571097954
一个脚本通常是解释执行而非编译。脚本语言通常都有简单、易学、易用的特性,目的就是希望能让程序员快速完成程序的编写工作。
Go+
https://www.youtube.com/watch?v=8uiZC0l4Ajw
学习Golang的完整教程!从开始到结束不到一个小时,包括如何在Go中构建API的完整演示。没有多余的内容,只有你需要知道的知识。
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
中文,免费,零起点,完整示例,基于最新的Python 3版本。
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Kubernetes+
https://kubernetes.io/docs/tutorials/kubernetes-basics/
This tutorial provides a walkthrough of the basics of the Kubernetes cluster orchestration system.
https://kubernetes.io/zh-cn/docs/tutorials/kubernetes-basics/
本教程介绍 Kubernetes 集群编排系统的基础知识。每个模块包含关于 Kubernetes 主要特性和概念的一些背景信息,还包括一个在线教程供你学习。
https://www.youtube.com/watch?v=s_o8dwzRlu4
Hands-On Kubernetes Tutorial | Learn Kubernetes in 1 Hour - Kubernetes Course for Beginners
https://www.youtube.com/watch?v=X48VuDVv0do
Full Kubernetes Tutorial | Kubernetes Course | Hands-on course with a lot of demos
Prometheus+
https://grafana.com/docs/grafana/latest/getting-started/get-started-grafana-prometheus/
Prometheus is an open source monitoring system for which Grafana provides out-of-the-box support.
https://prometheus.io/docs/tutorials/getting_started/
Prometheus is a system monitoring and alerting system.
Grafana+
大数据+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
还有更多 •••