AMD Image and Video Generation Algorithm Intern (Jan - Jun 2026)
Requirements
Proficiency in at least one deep learning framework (such as TensorFlow, PyTorch, etc.) to design, implement, and optimize complex models. Excellent problem-solving and analytical skills, capable of working effectively and delivering results under high pressure. Strong…
Responsibilities
Location: Beijing

THE ROLE:
AMD is looking for an AI R&D intern to join our growing team. As a key contributor, you will be part of a leading team that drives and enhances AMD's ability to explore the highest-quality, academic- and industry-leading technologies.

THE PERSON:
The ideal candidate has an innovative, problem-solving mindset, a keen eye for software engineering, and is diligent and passionate about technology. A successful candidate will need strong knowledge of computer technologies and software engineering expertise, as well as the ability to work effectively in a fast-paced environment with different teams of engineers and collaborators.

KEY RESPONSIBILITIES:
Research the latest advancements and technologies in Generative AI, specifically image/video/world generation and MLLM, and design and develop innovative applications aligned with company needs.
Study SOTA generation algorithms and enhance the accuracy and performance of existing models.
Explore optimized deployment approaches that ensure efficiency in production environments.
Collaborate with teams, share best practices, and provide guidance and support on Generative AI technologies.

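1. Algorithm innovation: explore methods for improving image quality and motion dynamics in diffusion-based image and video generation.
2. Algorithm innovation: explore distillation-based and training-free methods for accelerating diffusion model inference.
3. Business support: improve existing diffusion models to deliver features that current business needs require, such as better portrait consistency, more stable long-video generation, and stronger instruction following.
4. Business support: improve existing diffusion models to support streaming generation.

1. Participate in R&D on image and video generation and explore frontier directions in visual generation.
2. Participate in research on image generation and editing, controllable video generation, multimodal visual generation, and reinforcement learning for visual generation.
3. Analyze and resolve quality and performance issues that arise when turning algorithms into products.
4. Participate in academic research and produce research results that influence the industry.

1. Participate in the R&D and deployment of Kuaishou Kling multimodal video generation (interns focus mainly on publishing papers), including but not limited to: foundation models such as t2v and i2v, multimodal controllable video generation and editing, and world models.
2. Explore combining multimodal large language model (MLLM) technologies such as deepseek/qwen with video generation, including but not limited to: improving Kling video generation's multimodal understanding, reasoning, and multi-turn interaction capabilities.
3. Explore combining speech with video generation, including but not limited to: speech-driven video generation and video with sound.
4. Explore real-time, scalable multimodal video generation techniques to improve the quality and efficiency of multimodal video generation.
5. Publish research results at top conferences and in top journals and release open-source code, raising the team's academic reputation in multimodal video generation and related fields.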