阿里云阿里云智能-服务器软硬件结合开发专家-深圳
社招全职5年以上云智能集团地点:深圳状态:招聘
任职要求
1、计算机/电子/数学等相关专业,精通C++/Python/Rust,具备5年以上高性能计算或GPU相关软件开发经验。 2、熟悉常用数据结构与算法;对计算机体系结构(如缓存/多线程/SIMD 等)有一定理解,能够深入性能瓶颈细节进行优化; 3、熟悉主流AI框架(PyTorch/TensorFlow)及优化工具(TensorRT-LLM、TVM),有LLM或扩散模型等生成式AI场景的调优经验。 4、对AI技术前沿有强烈兴趣,能快速学习并解决复杂技术问题,熟练掌握行业论文的分析能力; 5、具有优秀的逻辑思维能力,能够适应AI软硬件结合技术的快速变化,能跟上主流AI优化技术的发展节奏; 6、有强烈技术热情和好奇心,自驱力和学习力强;具备良好的分析与解决问题的能力、沟通以及团队合作能力; 7、喜欢挑战性的技术研发工作,善于攻坚…
登录查看完整任职要求
微信扫码,1秒登录
工作职责
1、参与视觉生成/多模态模型(包括文本、图像、视频生成等)在 GPU、ASIC、FPGA 等异构硬件上的推理/后训练加速开发与软硬件结合的性能优化工作,包括但不限于模型量化、attention优化、显存优化、编译优化、计算与通信优化、内存管理以及多卡或多设备的并行推理方案等; 2、在主流深度学习框架(如 PyTorch)基础上,基于GPU/xPU硬件特点,对关键算子进行软硬件结合优化,提升模型运行效率; 3、与硬件以及算法工程师紧密配合,共同优化整体推理速度与资源占用; 4、跟踪学术界与工业界前沿技术(如扩散模型优化、VAE并行优化、AI编解码、面向机器的编解码等),推动软硬件协同创新。
包括英文材料
C+++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
中文,免费,零起点,完整示例,基于最新的Python 3版本。
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Rust+
https://www.youtube.com/watch?v=BpPEoZW5IiY
In this comprehensive Rust course for beginners, you will learn about the core concepts of the language and underlying mechanisms in theory.
https://www.youtube.com/watch?v=lzKeecy4OmQ
Full Rust 101 Crash Course for beginners.
https://www.youtube.com/watch?v=rQ_J9WH6CGk
数据结构+
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
算法+
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
缓存+
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
多线程+
https://liaoxuefeng.com/books/java/threading/basic/index.html
和单线程相比,多线程编程的特点在于:多线程经常需要读写共享数据,并且需要同步。
https://www.youtube.com/watch?v=_uQgGS_VIXM&list=PLsc-VaxfZl4do3Etp_xQ0aQBoC-x5BIgJ
https://www.youtube.com/watch?v=IEEhzQoKtQU
https://www.youtube.com/watch?v=mTGdtC9f4EU&list=PLL8woMHwr36EDxjUoCzboZjedsnhLP1j4
https://www.youtube.com/watch?v=TPVH_coGAQs&list=PLk6CEY9XxSIAeK-EAh3hB4fgNvYkYmghp
https://www.youtube.com/watch?v=xPqnoB2hjjA
This video is an introduction to multithreading in modern C++.
https://www.youtube.com/watch?v=YKBwKy5PrpQ
Rust threading is easy to implement and improves the efficiency of your applications on multi-core systems!
PyTorch+
https://datawhalechina.github.io/thorough-pytorch/
PyTorch是利用深度学习进行数据科学研究的重要工具,在灵活性、可读性和性能上都具备相当的优势,近年来已成为学术界实现深度学习算法最常用的框架。
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
TensorFlow+
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
还有更多 •••
相关职位
社招5年以上技术类-开发
1、服务器软硬件一体系统设计与开发:基于产品需求分析,进行整体技术方案设计、开发和验证交付。 2、系统性能优化:对服务器进行软硬件一体性能优化和分析,实现软件系统稳定性/性能的提升。 3、系统测试与维护:对软硬件系统进行集成验证交付,对系统性问题进行分析定位,快速解决,保证满足系统性能、稳定性等要求。
更新于 2025-07-02深圳|杭州
社招5年以上云智能集团
1. 负责智能网卡的网卡驱动和RDMA驱动开发和实现; 2. 负责智能网卡在AI智算,存储等领域软硬件结合优化,创新研究; 3. 通过智能网卡的软硬件创新与优化,包括高性能网络协议的硬件卸载优化,帮助云产品基础设施持续提升技术竞争力。
更新于 2025-08-07深圳|杭州
社招5年以上云智能集团
1、设计并实现高效的AIGC工程/图像/视频处理软硬件一体化方案,参与媒体计算产品全生命周期开发。 2、负责系统性能调优,识别并解决关键瓶颈,提升稳定性与效率。 3、开发和维护底层驱动、基础软件及图像/视频SDK,确保硬件(ASIC/FPGA/GPU)与应用高效协同。
更新于 2025-09-08深圳