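Tencent | Tencent Cloud - Large Model Inference Optimization Expert
Experienced hire | Full-time | 3+ years of experience | Tencent Cloud | Technical | Location: Shenzhen | Status: Hiring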
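Requirements
1. Bachelor's degree or above, with extensive R&D experience in KV Cache for inference scenarios and a deep understanding of, and insight into, the related optimization techniques;
2. Has taken part in work on large-scale online KV Cache systems…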
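Responsibilities
1. Design, implement, and maintain the KV Cache subsystem of the large model inference engine;
2. Drive deep adaptation, compatibility work, and low-level tuning of KV Cache for new large models, heterogeneous hardware platforms, and new technical features;
3. Optimize core metrics such as GPU memory footprint, fragmentation, reuse rate, hit rate, latency, and throughput to improve performance and reduce cost;
4. Explore cutting-edge KV Cache techniques from across the industry in depth and turn them into business value in light of business characteristics.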
Includes English-language materials
Cache
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
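As a rough illustration of two of the terms above (eviction policy and LRU), here is a minimal sketch of a fixed-capacity LRU cache that also tracks hit rate, one of the metrics named in the responsibilities. The class name LRUCache and its methods are hypothetical, chosen for this example only; this is a generic key-value cache, not a model of any inference engine's KV Cache.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data = OrderedDict()  # ordered from oldest use to newest use
        self.hits = 0
        self.misses = 0

    def get(self, key, default=None):
        """Return the cached value and mark the key as most recently used."""
        if key in self._data:
            self._data.move_to_end(key)
            self.hits += 1
            return self._data[key]
        self.misses += 1
        return default

    def put(self, key, value) -> None:
        """Insert or update a key, evicting the oldest entry when over capacity."""
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used entry

    def hit_rate(self) -> float:
        """Fraction of get() calls that found their key; 0.0 before any lookups."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With a capacity of 2, putting keys "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was used more recently.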
https://www.youtube.com/watch?v=dGAgxozNWFE