NVIDIA Senior Solutions Architect - KV Cache and AI Storage

Experienced hire, full-time. Location: Beijing. Status: Hiring.

Requirements


• Bachelor's degree or higher in Computer Science or a related field with strong systems or storage background.
• 5+ years of relevant experience, including 2+ years working on KV stores/caches or storage backends.
• Hands‑on experience with distributed storage, caching, or large‑scale backend systems.
• Solid understanding of Transformer / LLM inference and KV cache concepts, plus experience with at least one LLM serving stack (for example vLLM, TensorRT-LLM, or SGLang).
• Strong knowledge of NVMe SSDs, KV SSDs, and modern storage servers, including controller/firmware behavior and I/O characteristics.
• Practical experience with tiered memory and KV cache optimizations such as offloading (HBM → DRAM → NVMe), eviction/selection strategies, compression/quantization, or attention‑level optimizations.
• Familiarity with at least one large‑scale storage or caching system (such as Ceph, Redis, Cassandra, RocksDB‑based KV, object storage, or distributed logs).

Ways to stand out from the crowd:
…

Responsibilities


• Lead technical exploration with customer architects to understand models, frameworks, SLOs, and KV cache usage patterns.
• Build end-to-end KV cache solutions using tiered memory and NVIDIA's modern networking technologies.
• Analyze performance profiles, identify bottlenecks, and drive PoCs and benchmarks to validate improvements.
• Translate customer difficulties into clear feature requests and roadmap input for NVIDIA products.
• Build reference architectures, best-practice guides, and deliver tech talks to support our field teams and customers.