AMD Operator Development Intern (AI/ML Kernel Optimization Intern)
Internship / Part-time | Location: Shanghai | Status: Hiring
Requirements
You are currently enrolled in a Master's program in Computer Science, Computer Engineering, or a related field at a China-based university. If you have knowledge of or experience with any of the following technical skills (or related areas) and are enthusiastic about this role, we strongly encourage you to apply:
Machine Learning & Data Science: Exposure to machine learning algorithms, data analysis, computer vision, etc. through coursework or projects.
Programming Languages: Strong programming skills in C++ with a focus on writing clean, efficient, and scalable code.
Machine Learning Frameworks: Practical experience with machine learning librar…
Responsibilities
An exciting internship opportunity to make an immediate contribution to AMD's next generation of technology innovations awaits you! We have a multifaceted, high-energy work environment filled with a diverse group of employees, and we provide outstanding opportunities for developing your career. During your internship, our programs provide the opportunity to collaborate with AMD leaders, receive one-on-one mentorship, attend networking events, and much more. Being part of AMD means receiving hands-on experience that will give you a competitive edge. Together We Advance your career!

JOB DETAILS:
Location: Shanghai, China
Onsite/Hybrid: This role requires the student to work at least 3 days per week, in either a hybrid (minimum 3 days in office) or onsite work structure, throughout the duration of the co-op/intern term.
Duration: Jan - June 2026

WHAT YOU WILL BE DOING:
We are seeking a highly motivated Machine Learning (ML)/Artificial Intelligence (AI) intern/co-op to join our team and contribute to the development of next-generation product differentiation features alongside expert ML/AI engineers. In this role, you will:
Gain hands-on experience with cutting-edge technologies in ML, AI, and High-Performance Computing.
Learn to analyze and optimize GPU kernels to maximize performance for specific AI operations.
Contribute to projects such as:
Researching, developing, and deploying machine learning and computer vision solutions for AMD's current and future products.
Working closely with internal teams to analyze and improve training and inference performance on AMD GPUs.
Designing and optimizing deep learning models specifically for AMD GPU performance.
Assisting AI software teams with roadmap planning, collateral development, and customer engagements.
Engaging with framework maintainers to ensure code changes are aligned with requirements and integrated upstream.
Applying sound engineering principles to ensure robust, maintainable solutions.
Includes English-language materials
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
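PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.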
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
Related positions

Internship | Algorithm track
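1. Explore development paradigms for Coding Agents based on large language models (LLMs), and drive the adoption of an **AI-driven operator development workflow (Human + Agent collaboration)**.
2. Use AI tools (such as Claude Code, Cursor, or in-house agents) to take part in high-performance operator development, including:
a. GPU: CUDA / C++ operator development and performance optimization (memory access, parallelism, kernel fusion, etc.)
b. BPU: operator development and optimization for the Horizon BPU (compilation constraints, operator mapping, dataflow optimization, etc.)
3. Help build an AI-assisted operator development system, including:
a. Prompt design and agent workflow construction
b. Automatic code generation, auto-tuning, automated benchmarking, and regression validation
c. Closing the loop on performance analysis and optimization using profiling tools
4. Help optimize key operators in large-model inference systems (Attention, KV Cache, MoE, etc.).
5. Help adapt and optimize operators across hardware platforms (GPU ↔ BPU).
6. Record technical documentation and best practices (CLAUDE.md, Skill, etc.).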
Updated 2026-04-02 | Beijing | Nanjing | Shanghai
Internship | D11722
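1. Contribute to the research and development of industry-leading low-latency, high-throughput inference optimization solutions for large models; targets include video generation models, multimodal models, and large language models.
2. Survey and reproduce the latest papers on large-model inference optimization, in areas including high-performance operator development, large-model quantization, and distributed parallel inference for large models.
3. Contribute to the team's serving framework to improve the efficiency of deploying large-model services.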
Updated 2025-05-20 | Beijing