
AMD AI Framework Engineer

Experienced hire · Full-time · Engineering · Location: Shanghai · Status: Hiring

Qualifications


1. Expertise in Inference Frameworks: Proven, hands-on experience with vLLM or SGLang, including a deep understanding of their source code, deployment, configuration, and performance tuning. (Please describe relevant projects in your resume.)
2. Mastery of Model Architectures: In-depth understanding and practical experience with the inference workflows of mainstream LLMs (e.g., DeepSeek, Qwen), including their tokenizers, model configurations, and architecture definitions.
3. Strong Theoretical Foundation: Solid grasp of the principles behind Transformer, Self-Attention, MoE, and KV Cache, and of their impact on inference performance (a back-of-the-envelope sizing sketch follows this list).
4. Proven Optimization Experience: Familiarity with end-to-end LLM inference optimization techniques such as PagedAttention, FlashAttention, continuous/dynamic batching, and quantization…
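For the KV Cache point in item 3 above, a back-of-the-envelope sizing sketch; the layer count, KV-head count, and head dimension below are hypothetical, chosen only to illustrate why GQA and paged allocation matter for throughput:

```python
# Rough KV-cache sizing for a decoder-only Transformer.
# 2x accounts for storing both K and V; every layer keeps one head_dim-sized
# vector per KV head per token, at dtype_bytes per element (2 for fp16/bf16).
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch * dtype_bytes

# Hypothetical 32-layer model with GQA (8 KV heads, head_dim 128), one 4k-token request:
per_request = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128,
                             seq_len=4096, batch=1)
print(f"{per_request / 2**20:.0f} MiB of KV cache per 4k-token request")  # 512 MiB
# With 64 full MHA heads instead of 8 KV heads the cache would be 8x larger,
# which is why GQA and PagedAttention's block-level allocation matter for throughput.
```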

Responsibilities


Position Overview

We are seeking a highly experienced engineer specializing in large language model (LLM) inference performance optimization. As a core member of the team, you will be responsible for building and optimizing high-throughput, low-latency LLM inference on AMD Instinct GPUs. If you are passionate about pushing performance boundaries and have deep, hands-on expertise with cutting-edge technologies like vLLM or SGLang, we invite you to join us.

Key Responsibilities

1. Core System Optimization: Lead the development, tuning, and customization of LLM performance optimizations on AMD GPUs, leveraging and extending frameworks like vLLM or SGLang to address performance bottlenecks in production environments (a configuration sketch follows this list).
2. Performance Analysis & Tuning: Conduct end-to-end performance profiling using specialized tools. Perform deep optimization of compute-bound operators (e.g., Attention), memory I/O, and communication to significantly increase throughput and reduce latency.
3. Model Architecture Adaptation: Demonstrate expertise in mainstream LLM architectures (e.g., DeepSeek, Qwen, Llama, ChatGLM) and optimize inference for their specific characteristics (e.g., RoPE, SWA, MoE, GQA).
4. Algorithm & Principle Application: Leverage a deep understanding of core algorithms (Transformer, Attention, MoE) to implement advanced optimization techniques such as PagedAttention, FlashAttention, continuous batching, quantization, and model compression.
5. Technology Foresight & Implementation: Research and prototype state-of-the-art optimization techniques (e.g., Speculative Decoding, Weight-Only Quantization) and drive their adoption into production systems.

Qualifications: Mandatory
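As a concrete illustration of the framework configuration surface mentioned in Responsibility 1, here is a minimal vLLM offline-inference sketch; the model id, GPU count, and knob values are assumptions for illustration, not requirements from this posting:

```python
from vllm import LLM, SamplingParams

# Minimal offline-inference setup; the settings below trade memory headroom,
# batch concurrency, and parallelism against throughput and latency.
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",  # assumed checkpoint, for illustration only
    tensor_parallel_size=2,            # shard weights across 2 GPUs
    gpu_memory_utilization=0.90,       # fraction of VRAM for weights + paged KV cache
    max_num_seqs=256,                  # cap on sequences batched continuously per step
    dtype="bfloat16",
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain continuous batching in one sentence."], params)
print(outputs[0].outputs[0].text)
```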
Related Positions

AMD
Experienced hire · Enginee…

THE ROLE:  As a core member of the team, you will play a pivotal role in optimizing and developing deep learning frameworks for AMD GPUs. Your experience will be critical in enhancing GPU kernels, deep learning models, and training/inference performance across multi-GPU and multi-node systems. You will engage with both internal GPU library teams and open-source maintainers to ensure seamless integration of optimizations, utilizing cutting-edge compiler technologies and advanced engineering principles to drive continuous improvement.

Updated 2025-10-25 · Shanghai
AMD
Experienced hire · Enginee…

THE ROLE: AMD is looking for a world-class AI frameworks engineer who can provide technical leadership in the development of various AI frameworks in the AMD ecosystem. You will drive technical direction for next-generation frameworks for AI model training and inference across a wide variety of AMD devices, current and future, such as Instinct MI and Radeon GPUs, XDNA devices including the recently released Ryzen AI, Alveo V70 and Versal ACAP, and datacenter CPUs such as EPYC. You will enhance AI framework capabilities to enable cutting-edge models on AMD's cutting-edge hardware.

Updated 2025-08-21 · Beijing
NVIDIA
Experienced hire

NVIDIA is now looking for LLM Training Framework Engineers for the Megatron Core team. Megatron Core is an open-source, scalable, cloud-native framework built for researchers and developers working on Large Language Model (LLM) and Multimodal (MM) foundation model pretraining and post-training. Our GenAI frameworks provide end-to-end model training, including pretraining, alignment, customization, evaluation, deployment, and tooling to optimize performance and user experience. You will build on Megatron Core's capabilities by inventing advanced distributed training algorithms and model optimizations, and collaborate with partners to implement optimized solutions.

What you’ll be doing:
• Build and develop the open-source Megatron Core.
• Address large-scale AI training and inference obstacles across the entire model lifecycle, including orchestration, data pre-processing, model training and tuning, and model deployment.
• Work at the intersection of AI applications, libraries, frameworks, and the entire software stack.
• Spearhead advancements in model architectures, distributed training strategies, and model-parallel approaches.
• Accelerate foundation model training and optimization through mixed-precision formats and advanced NVIDIA GPU architectures (a minimal mixed-precision sketch follows this list).
• Tune and optimize the performance of deep learning framework and software components.
• Research, prototype, and develop robust and scalable AI tools and pipelines.
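To make the mixed-precision bullet above concrete, here is a minimal PyTorch sketch of a bf16 autocast training step; the toy model and shapes are hypothetical, and Megatron Core applies this kind of recipe at far larger, model-parallel scale:

```python
import torch
from torch import nn

# Toy single-GPU training step with bfloat16 autocast.
model = nn.Linear(4096, 4096).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

x = torch.randn(8, 4096, device="cuda")
target = torch.randn(8, 4096, device="cuda")

optimizer.zero_grad()
# Run the forward pass and loss in bf16 to use tensor cores and cut memory
# traffic, while optimizer state and master weights stay in fp32.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    loss = nn.functional.mse_loss(model(x), target)
loss.backward()
optimizer.step()
```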

Updated 2025-10-13 · Shanghai
Microsoft
Experienced hire · Technolo…

• Drive technical sales with decision makers, using demos and PoCs to influence solution design and enable production deployments.
• Lead hands-on engagements (hackathons, code-with sessions, and architecture workshops) to accelerate adoption of Microsoft’s developer tools and cloud platforms.
• Build trusted relationships with developers and platform leads, co-designing secure, scalable architectures and solutions.
• Resolve technical blockers and objections, collaborating with engineering to share insights and improve products.
• Maintain deep expertise in AI Foundry & App architecture (Agentic AI framework, Semantic Kernel, Foundry SDK, Responsible AI) and app architecture/cloud-native development (APIs, containerization, microservices, event-driven, Python, Java or .NET).
• Maintain and grow expertise in AI Management & Security (Gen AI Ops, Sentinel, orchestrator, monitoring).
• Represent Microsoft through thought leadership in developer communities and customer forums.

Updated 2025-09-26 · Shenzhen