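Alibaba Intelligent Engine - Multimodal Model Optimization Expert - GPU/NPU/XPU
Experienced hire | Full-time | 3+ years of experience | Technical (Development) | Location: Hangzhou | Status: Hiring
Requirements
1. Bachelor's degree or above in computer science, electronic engineering, or a related field, with a deep understanding of computer architecture.
2. Experience in high-performance computing optimization on GPU/NPU/XPU; proficient in at least one heterogeneous computing platform and its programming model (e.g., CUDA, ROCm, OpenCL, SYCL, CANN), and have, for AMD, Huawei, Intel, and other…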
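Responsibilities
We are Alibaba's offline inference team. We are responsible for large-scale multimodal data processing pipelines and support model architecture customization for non-LLM models, adaptation to heterogeneous accelerator cards, and inference acceleration. Working with teams such as Qwen and Bailian, we deliver software and SaaS that provide strong technical support and foundational service capabilities to business units across the group, including Taotian, AIDC, Amap, Youku, and Xianyu.
1. Lead or play a core role in compiler-based, platform-level operator optimization, using technology stacks such as Triton, TileLang, and JAX/MLIR to support model architecture customization and optimization and to shorten the adaptation cycle for new card types or new models.
2. Use professional profiling tools to analyze end-to-end model performance on heterogeneous hardware, pinpoint bottlenecks in kernel execution, data movement, communication, and other stages, and propose systematic optimization plans.
3. For specific heterogeneous chips (e.g., Huawei Ascend, AMD MI series), analyze in depth their instruction sets, memory hierarchy (HBM/cache), and compute unit characteristics, and hand-write and optimize core operators in the chip's native language to achieve peak performance.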
Includes English-language materials
Education
CUDA
https://developer.nvidia.com/blog/even-easier-introduction-cuda/
This post is a super simple introduction to CUDA, the popular parallel computing platform and programming model from NVIDIA.
https://www.youtube.com/watch?v=86FAWCzIe_4
Learn how to program with Nvidia CUDA and leverage GPUs for high-performance computing and deep learning.
OpenCL
https://developer.nvidia.com/opencl
OpenCL™ (Open Computing Language) is a low-level API for heterogeneous computing that runs on CUDA-powered GPUs.
https://engineering.purdue.edu/~smidkiff/ece563/NVidiaGPUTeachingToolkit/Mod20OpenCL/3rd-Edition-AppendixA-intro-to-OpenCL.pdf
We will give a brief overview of OpenCL for CUDA programmers.
[English] Hands On OpenCL
https://handsonopencl.github.io/
An open source two-day lecture course for teaching and learning OpenCL
https://leonardoaraujosantos.gitbook.io/opencl/chapter1
Open Computing Language is a framework for writing programs that execute across heterogeneous platforms.
https://ulhpc-tutorials.readthedocs.io/en/latest/gpu/opencl/
OpenCL came as a standard for heterogeneous programming that enables a code to run in different platforms.
https://www.youtube.com/watch?v=4q9fPOI-x80
This presentation will show how to make use of the GPU from Java using OpenCL.