Xiaomi Image Algorithm Engineer Intern (AIGC track)
Internship/Part-time | Location: Beijing | Status: Hiring
Requirements
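1. Education: Bachelor's degree or above in pattern recognition, machine learning, computer science, automation, mathematics, or a related field.
2. Mathematical foundation: A solid mathematical foundation, familiarity with the theory of pattern recognition and machine learning, and command of the algorithms commonly used in the field.
3. Programming: Proficiency in one of C, C++, or Python, and familiarity with development on Linux.
4. Logical thinking: Excellent logical reasoning, strong drive and curiosity, and a readiness to adopt and learn new technologies.
5. Teamwork: A good team player with strong communication skills.
6. Experience: Experience developing image or multimodal algorithms, with hands-on experience in at least one of the following areas:
a. A working understanding of deep learning algorithms, including but not limited to text classification, semantic understanding, image/video understanding, detection, segmentation, and face and text generation.
b. Familiarity with common machine learning and deep learning algorithms; proficiency in at least one deep learning framework such as PyTorch, TensorRT, TensorFlow, MNN, or NCNN; and a solid command and understanding of common network architectures such as CNN, RNN, and Transformer.
7. Awareness of the research frontier: Strong interest in academic research, keeps up with the latest technical developments, and is skilled at applying a range of techniques to complex practical problems.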
Responsibilities
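1. Frontier algorithm R&D
• Lead R&D of core computer vision and AIGC algorithms (detection, segmentation, generation, multimodal, etc.), and bring super-resolution, restoration, and beautification techniques into production in business scenarios, improving both quality and efficiency.
• Explore novel applications of generative models such as Stable Diffusion, and optimize key techniques for image generation and intelligent editing (e.g., text-driven editing, semantic inpainting) according to business needs.
2. Engineering and deployment
• Carry algorithms through the full path from prototype to product, addressing engineering challenges such as model compression (quantization/pruning), inference acceleration (TensorRT/MNN deployment), and cross-platform adaptation.
• Build high-accuracy, low-latency CV pipelines covering practical needs such as image rectification, denoising, and OCR.
3. Forward-looking research
• Track developments at top conferences such as CVPR and ICML, conduct targeted R&D on frontier models such as Diffusion Models and Vision Transformer, and build a technical moat.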
Includes English-language materials
Education
Pattern recognition
https://www.mathworks.com/discovery/pattern-recognition.html
Pattern recognition is the process of classifying input data into objects, classes, or categories using computer algorithms based on key features or regularities.
https://www.microsoft.com/en-us/research/wp-content/uploads/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science.
Machine learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Linux
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Deep learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
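PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.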
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
TensorRT
https://docs.nvidia.com/deeplearning/tensorrt/latest/getting-started/quick-start-guide.html
This TensorRT Quick Start Guide is a starting point for developers who want to try out the TensorRT SDK; specifically, it demonstrates how to quickly construct an application to run inference on a TensorRT engine.
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
MNN
https://github.com/alibaba/MNN?tab=readme-ov-file#intro
MNN is a highly efficient and lightweight deep learning framework.
CNN
https://learnopencv.com/understanding-convolutional-neural-networks-cnn/
Convolutional Neural Network (CNN) forms the basis of computer vision and image processing.
[English] CNN Explainer
https://poloclub.github.io/cnn-explainer/
Learn Convolutional Neural Network (CNN) in your browser!
https://www.deeplearningbook.org/contents/convnets.html
Convolutional networks (LeCun, 1989), also known as convolutional neural networks, or CNNs, are a specialized kind of neural network for processing data.
https://www.youtube.com/watch?v=2xqkSUhmmXU
MIT Introduction to Deep Learning 6.S191: Lecture 3 Convolutional Neural Networks for Computer Vision
RNN
https://d2l.ai/chapter_recurrent-neural-networks/rnn.html
A neural network that uses recurrent computation for hidden states is called a recurrent neural network (RNN).
https://www.deeplearningbook.org/contents/rnn.html
Recurrent neural networks, or RNNs (Rumelhart et al., 1986a), are a family of neural networks for processing sequential data.
https://www.ibm.com/think/topics/recurrent-neural-networks
A recurrent neural network or RNN is a deep neural network trained on sequential or time series data to create a machine learning (ML) model that can make sequential predictions or conclusions based on sequential inputs.
Transformer
https://huggingface.co/learn/llm-course/en/chapter1/4
Breaking down how Large Language Models work, visualizing how data flows through.
https://poloclub.github.io/transformer-explainer/
An interactive visualization tool showing you how transformer models work in large language models (LLM) like GPT.
https://www.youtube.com/watch?v=wjZofJX0v4M
Breaking down how Large Language Models work, visualizing how data flows through.
Related positions
Internship | Taotian Group | Research-oriented…
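We are Alimama's Intelligent Creation and AI Applications team. We have long worked on algorithms that apply CV, NLP, and other multimodal and multimedia technologies to content creation and content understanding. The team has built up its technology over many years and is widely influential in the industry for e-commerce creative asset generation. It has developed products such as the Alimama Creative Center and Wanxiang Lab, as well as technologies such as Alimama intelligent image production (Auto Poster) and Alimama video generation (AtomoVideo), with research published at top conferences including CVPR, ICCV, AAAI, ACM MM, WWW, and ACL.
We sincerely welcome you to join the team. The role corresponds to one of the following:
1. One year of experience with Diffusion Models for image generation (strongly relevant), plus one year of experience in the image domain.
2. Experience applying image algorithms in advertising or e-commerce scenarios.
3. Top-conference papers on image generation (CVPR, ECCV, NIPS, MM).
Updated 2025-09-08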
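Internship | Hujing Digital Media & Entertainment (虎鲸文娱) 2026
Youku has massive amounts of image and video data, powerful computing resources, and a huge market. We need you to have a foundation in computer vision and practical experience in areas such as visual analysis, diagnosis, search, and synthesis. We look forward to smart, optimistic, resilient, self-reflective, excellence-seeking, and self-driven people joining Youku to open up new ground in visual technology together.
Responsibilities include but are not limited to:
1. Algorithm research on image/video analysis, diagnosis, synthesis, and editing, as well as supervised fine-tuning of multimodal large language models.
2. Exploration of frontier techniques in image/video/3D algorithms, including joint innovation in image/video/3D generation and controllable editing.
Updated 2025-05-06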