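Honor AI High-Performance Computing Engineer
Experienced hire | Full-time | R&D | Location: Beijing | Status: Hiring
Requirements
1. High-performance computing, parallel computing, heterogeneous computing, and performance optimization; 2. A solid foundation in computer systems, computer organization, and computer architecture; 3. Extensive tuning experience on CPU, GPU, TPU, NPU, x86, ARM, DSP, or AI processors; 4. CUDA, cuDNN, TensorRT, OpenBLAS, OpenMP, MKL, OpenCL, or other parallel computing…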
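Responsibilities
1. Develop the core functionality of a deep learning framework, implement its compute operations, and support common chip platforms; 2. Use high-performance computing libraries to speed up the framework's computation; 3. Keep up with the latest industry technology and assess its maturity.
English-language resources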
CUDA
https://developer.nvidia.com/blog/even-easier-introduction-cuda/
This post is a super simple introduction to CUDA, the popular parallel computing platform and programming model from NVIDIA.
https://www.youtube.com/watch?v=86FAWCzIe_4
Learn how to program with Nvidia CUDA and leverage GPUs for high-performance computing and deep learning.
TensorRT
https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/quick-start-guide.html
This TensorRT Quick Start Guide is a starting point for developers who want to try out the TensorRT SDK; specifically, it demonstrates how to quickly construct an application to run inference on a TensorRT engine.
OpenCL
https://developer.nvidia.com/opencl
OpenCL™ (Open Computing Language) is a low-level API for heterogeneous computing that runs on CUDA-powered GPUs.
https://engineering.purdue.edu/~smidkiff/ece563/NVidiaGPUTeachingToolkit/Mod20OpenCL/3rd-Edition-AppendixA-intro-to-OpenCL.pdf
We will give a brief overview of OpenCL for CUDA programmers.
[English] Hands On OpenCL
https://handsonopencl.github.io/
An open source two-day lecture course for teaching and learning OpenCL
https://leonardoaraujosantos.gitbook.io/opencl/chapter1
Open Computing Language is a framework for writing programs that execute across heterogeneous platforms.
https://ulhpc-tutorials.readthedocs.io/en/latest/gpu/opencl/
OpenCL came as a standard for heterogeneous programming that enables a code to run in different platforms.
https://www.youtube.com/watch?v=4q9fPOI-x80
This presentation will show how to make use of the GPU from Java using OpenCL.
HPC
https://www.ibm.com/think/topics/hpc
HPC is a technology that uses clusters of powerful processors that work in parallel to process massive, multidimensional data sets and solve complex problems at extremely high speeds.
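Related positions
Campus hire: Algorithms and Software
1. NPU firmware/runtime library development and delivery; 2. NPU firmware instruction set design and development; 3. Develop pre-silicon verification cases and support operator-level and full-network joint debugging on various simulation platforms; 4. Take part in post-silicon NPU bring-up; 5. Deploy large models to mass production on NPU chips.
Beijing
Campus hire: Algorithms and Software
1. NPU firmware/runtime library development and delivery; 2. NPU firmware instruction set design and development; 3. Develop pre-silicon verification cases and support operator-level and full-network joint debugging on various simulation platforms; 4. Take part in post-silicon NPU bring-up; 5. Deploy large models to mass production on NPU chips.
Hangzhou
Campus hire: Algorithms and Software
1. NPU firmware/runtime library development and delivery; 2. NPU firmware instruction set design and development; 3. Develop pre-silicon verification cases and support operator-level and full-network joint debugging on various simulation platforms; 4. Take part in post-silicon NPU bring-up; 5. Deploy large models to mass production on NPU chips.
Shanghai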