
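Ping An Technology: Senior IaaS Operations Engineer
Experienced hire | Full-time | 5+ years | Computer and network technology | Location: Shenzhen | Status: Hiring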
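Requirements
1. Bachelor's degree or above in computer science, communications, electronics, or a related field;
2. 5+ years of server operations experience; a background in large-scale data center operations is preferred;
3. Expert in installing, configuring, tuning, and troubleshooting Linux systems;
4. Familiar with the hardware architecture and management tools (iDRAC, iLO, BMC, etc.) of mainstream server brands (e.g., Dell, HPE, Inspur, Huawei);
5. Experience operating GPU servers; familiar with NVIDIA GPU drivers, CUDA, NCCL, NVLink, GPUDirect, and related technologies;
6. Proficient in Shell/Python scripting, able to write automation…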
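Responsibilities
1. Deploy, configure, monitor, maintain, and optimize large-scale physical server clusters (including GPU servers) to ensure high availability and stability;
2. Lead the deployment of GPU servers (e.g., NVIDIA A100/H100), including driver installation, CUDA environment configuration, and performance tuning, in support of AI training and inference workloads;
3. Design and implement server automation solutions covering system initialization, firmware upgrades, configuration management, and batch deployment, to improve operational efficiency;
4. Diagnose and resolve server hardware failures, coordinate repairs and replacements with vendors, and establish a sound hardware lifecycle management process;
5. Build and maintain the server monitoring stack (e.g., Prometheus, Zabbix, Grafana), providing real-time monitoring and alerting on key metrics such as CPU, memory, disk, GPU utilization, temperature, and power draw;
6. Work with the DevOps team to automate the scheduling and management of physical resources within CI/CD pipelines;
7. Write and maintain technical documentation, including deployment manuals, incident-handling SOPs, and emergency response plans;
8. Take part in data center infrastructure planning and assist with on-site work such as server racking, network cabling, and power management;
9. Track the latest trends in GPUs, AI computing, and high-performance computing (HPC), and drive the continued evolution of the operations practice.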
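As an illustration of the GPU monitoring and scripting work described above, here is a minimal Python sketch that parses the CSV output of `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` and flags GPUs whose temperature or power draw exceeds a threshold. The threshold values, the `GpuSample` type, and the function names are assumptions made for this example, not anything specified in the posting; a production setup would more likely export these metrics to Prometheus or Zabbix than print them.

```python
import csv
import io
import subprocess
from dataclasses import dataclass
from typing import List, Optional

# Fields requested from nvidia-smi, in this order.
QUERY_FIELDS = "index,utilization.gpu,temperature.gpu,power.draw"


@dataclass
class GpuSample:
    index: int
    utilization_pct: Optional[float]
    temperature_c: Optional[float]
    power_w: Optional[float]


def _to_float(value: str) -> Optional[float]:
    """Return a float, or None for placeholders such as "[N/A]"."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_gpu_csv(text: str) -> List[GpuSample]:
    """Parse 'csv,noheader,nounits' output, one GPU per line."""
    samples = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        idx, util, temp, power = (cell.strip() for cell in row)
        samples.append(
            GpuSample(int(idx), _to_float(util), _to_float(temp), _to_float(power))
        )
    return samples


def find_alerts(samples, max_temp_c=85.0, max_power_w=400.0):
    """Return alert messages; the default thresholds are hypothetical."""
    alerts = []
    for s in samples:
        if s.temperature_c is not None and s.temperature_c > max_temp_c:
            alerts.append(
                f"GPU {s.index}: temperature {s.temperature_c:.0f} C "
                f"exceeds {max_temp_c:.0f} C"
            )
        if s.power_w is not None and s.power_w > max_power_w:
            alerts.append(
                f"GPU {s.index}: power draw {s.power_w:.0f} W "
                f"exceeds {max_power_w:.0f} W"
            )
    return alerts


def query_gpus() -> List[GpuSample]:
    """Run nvidia-smi on the local host (requires the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    # Made-up sample text so the sketch runs without a GPU.
    sample = "0, 97, 88, 412.50\n1, 12, 41, [N/A]\n"
    for message in find_alerts(parse_gpu_csv(sample)):
        print(message)
```

Keeping the parser separate from the `nvidia-smi` call means the alert logic can be tested on hosts without a GPU; `query_gpus()` is only needed on the servers themselves.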
Includes English-language materials
Education
Linux
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
CUDA
https://developer.nvidia.com/blog/even-easier-introduction-cuda/
This post is a super simple introduction to CUDA, the popular parallel computing platform and programming model from NVIDIA.
https://www.youtube.com/watch?v=86FAWCzIe_4
Learn how to program with Nvidia CUDA and leverage GPUs for high-performance computing and deep learning.
NCCL
https://developer.nvidia.com/nccl
The NVIDIA Collective Communication Library (NCCL) implements multi-GPU and multi-node communication primitives optimized for NVIDIA GPUs and networking.
Bash
[English] The Bash Guide
https://guide.bash.academy/
A quality-driven guide through the shell's many features.
https://www.youtube.com/watch?v=tK9Oc6AEnR4
Understanding how to use bash scripting will enhance your productivity by automating tasks, streamlining processes, and making your workflow more efficient.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for absolute beginners, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Scripting
[English] Scripting language
https://en.wikipedia.org/wiki/Scripting_language
https://zhuanlan.zhihu.com/p/571097954
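A script is usually interpreted rather than compiled. Scripting languages are generally simple, easy to learn, and easy to use, with the aim of letting programmers finish writing programs quickly.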