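Alibaba Cloud: Alibaba Cloud Intelligence - Server R&D Hardware-Software Co-Design Development Expert - Shenzhen/Beijing
Experienced hire, full-time, 5+ years, Cloud Intelligence Group. Location: Beijing | Shenzhen. Status: Hiring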
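Requirements
1. A degree in computer science, electronics, mathematics, or a related field; proficiency in C++/Python/Rust; at least 5 years of software development experience in high-performance computing or GPU-related work.
2. Familiarity with common data structures and algorithms; a working understanding of computer architecture (e.g., caches, multithreading, SIMD), with the ability to dig into the details of performance bottlenecks and optimize them.
3. Familiarity with mainstream AI frameworks (PyTorch/TensorFlow) and optimization tools (TensorRT-LLM, TVM); tuning experience in generative AI scenarios such as LLMs or diffusion models.
4. Strong interest in frontier AI technology; able to learn quickly and solve complex technical problems; skilled at analyzing industry papers.
5. Excellent logical reasoning; able to adapt to rapid change in AI hardware-software co-design and keep pace with mainstream AI optimization techniques.
6. Strong technical passion and curiosity; self-driven, with strong learning ability; good analytical, problem-solving, communication, and teamwork skills.
7. Enjoys challenging technical R&D work and is good at tackling hard problems…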
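Responsibilities
1. Participate in inference and post-training acceleration development, and in hardware-software co-optimized performance work, for visual generation and multimodal models (including text, image, and video generation) on heterogeneous hardware such as GPUs, ASICs, and FPGAs. This includes but is not limited to model quantization, attention optimization, GPU memory optimization, compiler optimization, computation and communication optimization, memory management, and multi-GPU or multi-device parallel inference schemes.
2. Building on mainstream deep learning frameworks (such as PyTorch) and the characteristics of GPU/xPU hardware, apply hardware-software co-optimization to key operators to improve model execution efficiency.
3. Work closely with hardware and algorithm engineers to jointly optimize overall inference speed and resource usage.
4. Track frontier techniques in academia and industry (e.g., diffusion model optimization, VAE parallel optimization, AI codecs, machine-oriented codecs), and drive hardware-software co-innovation.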
Includes English-language materials
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Rust
https://www.youtube.com/watch?v=BpPEoZW5IiY
In this comprehensive Rust course for beginners, you will learn about the core concepts of the language and underlying mechanisms in theory.
https://www.youtube.com/watch?v=lzKeecy4OmQ
Full Rust 101 Crash Course for beginners.
https://www.youtube.com/watch?v=rQ_J9WH6CGk
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Caching
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
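Multithreading
https://liaoxuefeng.com/books/java/threading/basic/index.html
Compared with single-threaded code, the defining trait of multithreaded programming is that threads often read and write shared data and must be synchronized.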
https://www.youtube.com/watch?v=_uQgGS_VIXM&list=PLsc-VaxfZl4do3Etp_xQ0aQBoC-x5BIgJ
https://www.youtube.com/watch?v=IEEhzQoKtQU
https://www.youtube.com/watch?v=mTGdtC9f4EU&list=PLL8woMHwr36EDxjUoCzboZjedsnhLP1j4
https://www.youtube.com/watch?v=TPVH_coGAQs&list=PLk6CEY9XxSIAeK-EAh3hB4fgNvYkYmghp
https://www.youtube.com/watch?v=xPqnoB2hjjA
This video is an introduction to multithreading in modern C++.
https://www.youtube.com/watch?v=YKBwKy5PrpQ
Rust threading is easy to implement and improves the efficiency of your applications on multi-core systems!
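PyTorch
https://datawhalechina.github.io/thorough-pytorch/
PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years has become the framework most commonly used in academia to implement deep learning algorithms.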
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
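Related positions
Experienced hire, 5+ years, Cloud Intelligence Group
1. Responsible for R&D, delivery, and operations of DPU systems for intelligent computing scenarios; 2. Responsible for offload acceleration and performance optimization of Alibaba Cloud storage and networking services on DPU systems; 3. Through hardware-software innovation and optimization for intelligent computing scenarios, help Alibaba Cloud's intelligent computing infrastructure continually improve its technical competitiveness.
Updated 2025-12-06, Shenzhen | Hangzhou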
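Experienced hire, 5+ years, Cloud Intelligence Group
1. Responsible for R&D and delivery of SmartNIC drivers and firmware, and for troubleshooting; 2. Responsible for hardware-software co-optimization and innovative research for SmartNICs in areas such as AI intelligent computing and storage; 3. Through hardware-software innovation and optimization of SmartNICs, help Alibaba Cloud's intelligent computing infrastructure continually improve its technical competitiveness.
Updated 2026-03-23, Shenzhen | Hangzhou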
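Experienced hire, 5+ years, Cloud Intelligence Group
1. Design and implement efficient integrated hardware-software solutions for AIGC engineering and image/video processing, and take part in full-lifecycle development of media computing products. 2. Own system performance tuning: identify and resolve key bottlenecks to improve stability and efficiency. 3. Develop and maintain low-level drivers, foundational software, and image/video SDKs so that hardware (ASIC/FPGA/GPU) and applications work together efficiently.
Updated 2025-09-08, Shenzhen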