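Alibaba Infrastructure and Stability Engineering - Senior Infrastructure Platform R&D Expert - File System Storage
Experienced hire | Full-time | 8+ years of experience | Technical - Development | Location: Beijing, Hangzhou | Status: Hiring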
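Requirements
1. Familiarity with the I/O characteristics of training and inference for large language models and generative AI models, and with the demands they place on storage systems.
2. Familiarity with the mainstream persistent storage and caching systems used in industry for big data systems and machine learning systems, …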
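Responsibilities
1. Develop cache acceleration systems for the various AI training and inference workloads, making full use of the high-speed storage media in compute clusters (such as HBM and NVMe SSD) and of RDMA communication bandwidth, to raise the efficiency and performance of AI training and inference computation; take responsibility for the end-to-end I/O performance and stability of the group's AI workloads.
2. On top of persistent storage, use the storage media of the compute clusters to build a unified log file system.
3. Refine the file storage layer to strengthen the file system's storage capabilities, improve storage space and data read/write speed, and thereby raise compute efficiency and performance.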
Includes English-language materials
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Caching
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE