Xiaomi Vision and Image Algorithm Engineer (Internship)
Internship / Part-time · Location: Beijing · Status: Hiring
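Qualifications
1. Major in computer science, information engineering, electronic engineering, robotics, or a related field, with C++/Python/Java development experience;
2. Solid command of deep learning theory, including backpropagation (BP), neural networks, CNN, RNN, LSTM, and Transformer; hands-on experience with large-model architectures (BERT, GPT, etc.) is preferred;
3. Basic knowledge of image processing, such as algorithms for image filtering, compression, and scaling, …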
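Responsibilities
1. Develop image algorithms that run on mobile phones, such as face recognition, text detection and OCR, and visual SLAM;
2. Optimize and deploy algorithms on phones, including model compression, quantization, and acceleration, making use of the phone's compute units (CPU, GPU, and NPU);
3. Handle preprocessing of training data, including image data collection, annotation, data augmentation, and data cleaning;
4. Take part in pre-research and productization of new technologies, keep up with leading industry algorithms, design better-performing algorithms, and write related papers and patents.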
Includes English-language materials
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Deep learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
LSTM
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
Humans don’t start their thinking from scratch every second.
https://d2l.ai/chapter_recurrent-modern/lstm.html
The term “long short-term memory” comes from the following intuition.
https://developer.nvidia.com/discover/lstm
A Long short-term memory (LSTM) is a type of Recurrent Neural Network specially designed to prevent the neural network output for a given input from either decaying or exploding as it cycles through the feedback loops.
https://www.youtube.com/watch?v=YCzL96nL7j0
Basic recurrent neural networks are great, because they can handle different amounts of sequential data, but even relatively small sequences of data can make them difficult to train.
Transformer
https://huggingface.co/learn/llm-course/en/chapter1/4
Breaking down how Large Language Models work, visualizing how data flows through.
https://poloclub.github.io/transformer-explainer/
An interactive visualization tool showing you how transformer models work in large language models (LLM) like GPT.
https://www.youtube.com/watch?v=wjZofJX0v4M
Breaking down how Large Language Models work, visualizing how data flows through.
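Related positions
Internship
1. Take part in R&D on image and video generation, exploring frontier directions in visual generation
2. Take part in research on image quality enhancement, controllable video generation, multimodal visual generation, and reinforcement learning for visual generation
3. Analyze and resolve quality and performance issues that arise when productizing algorithms
4. Take part in academic research and produce results with industry impact
Updated 2025-05-23 · Wuhan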
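Internship
1. Take part in R&D on image and video generation, exploring frontier directions in visual generation
2. Take part in research on image quality enhancement, controllable video generation, multimodal visual generation, and reinforcement learning for visual generation
3. Analyze and resolve quality and performance issues that arise when productizing algorithms
4. Take part in academic research and produce results with industry impact
Updated 2025-02-19 · Wuhan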
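Internship
1. Take part in R&D on image and video generation, exploring frontier directions in visual generation
2. Take part in research on image generation and editing, controllable video generation, multimodal visual generation, and reinforcement learning for visual generation
3. Analyze and resolve quality and performance issues that arise when productizing algorithms
4. Take part in academic research and produce results with industry impact
Updated 2025-09-01 · Wuhan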
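Internship
1. Frontier algorithm R&D
• Lead R&D of core computer vision and AIGC algorithms (detection, segmentation, generation, multimodal, etc.), and bring super-resolution, restoration, and beautification techniques into business scenarios, improving both quality and efficiency.
• Explore new applications of generative models such as Stable Diffusion, and optimize key techniques such as image generation and intelligent editing (e.g. text-driven editing, semantic inpainting) to fit business needs.
2. Engineering and deployment
• Carry algorithms through the full path from prototype to product, solving engineering challenges such as model compression (quantization/pruning), inference acceleration (TensorRT/MNN deployment), and cross-platform adaptation.
• Build high-accuracy, low-latency CV pipelines covering practical needs such as image rectification, denoising, and OCR.
3. Forward-looking technical research
• Track developments at top conferences such as CVPR and ICML, and develop frontier models such as Diffusion Models and Vision Transformer in a targeted way to build a technical moat.
Updated 2025-08-21 · Beijing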