Xiaomi Senior Algorithm Engineer
Experienced hire · Full-time · 5+ years · A249787 · Location: Beijing · Status: Hiring
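Qualifications
1. Bachelor's degree or above in computer science, electronics, mathematics, machine learning, statistics, or a related field; 5+ years of modeling experience in machine learning, deep learning, and large models.
2. Proficient in Python and familiar with common data structures and algorithms; able to implement complex NLP algorithm logic efficiently, with good coding habits and code optimization skills that keep algorithm code readable, maintainable, and efficient.
3. Proficient in at least one deep learning framework (such as TensorFlow or PyTorch), with a deep understanding of the basic principles and architectures of neural networks, including but not limited to the application of RNNs, CNNs, and Transformers in NLP; able to use these frameworks flexibly to build, train, and deploy NLP models that handle the complex characteristics of public opinion data.
4. Familiar with the methodologies behind common large models such as Qwen, GLM, and Baichuan; able to improve inference accuracy through prompt tuning, with hands-on experience in large-model fine-tuning techniques such as LoRA and P-Tuning.
5. Deep understanding of NLP, including the principles and methods of common tasks such as text classification, sentiment analysis, and named entity recognition, backed by extensive practical experience.
6. Experience applying classification, sentiment analysis, and summarization algorithms in sudden high-traffic scenarios such as product launch events.
7. Familiar with the principles of machine learning algorithms, including supervised, unsupervised, and reinforcement learning; able to apply them to classification, clustering, prediction, and summarization problems in public opinion data, supporting public opinion trend analysis and hot-topic mining.
8. Understands the characteristics of public opinion data and the associated business needs, with working knowledge of public opinion monitoring, analysis, and early warning; able to combine NLP techniques closely with public opinion business scenarios to deliver solutions that fit customers' actual needs.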
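Responsibilities
1. Own algorithm modeling and optimization for NLP tasks in the public opinion monitoring system, including the text classification, sentiment analysis, entity recognition, semantic understanding, and video content understanding modules, so that valuable information can be extracted quickly and accurately from massive text data, providing solid technical support for applications such as public opinion early warning and trend analysis.
2. Study the characteristics of public opinion data in depth and explore suitable NLP model architectures and algorithm strategies; continuously improve existing models to handle the complexity of public opinion text (such as internet slang and interleaved multi-domain topics) and strengthen generalization so the models cope with diverse public opinion scenarios and data shifts.
3. Define annotation standards and work with annotators to build high-quality datasets that provide the base data for algorithm training; continuously optimize algorithm performance based on feedback data, driving algorithm iteration with data.
4. Track cutting-edge industry technology and research results, for example applying large language models and multimodal models to public opinion analysis scenarios.
5. Assist the development team in productionizing algorithm results, ensuring models run efficiently and stably in the live public opinion monitoring system; take part in testing and evaluating algorithm performance, promptly resolve technical problems that arise during launch, and safeguard system stability.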
Includes English-language materials
Machine learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Education
Deep learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
Large models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
NLP
https://www.youtube.com/watch?v=fNxaJsNG3-s&list=PLQY2H8rRoyvzDbLUZkbudP-MFQZwNmU4S
Welcome to Zero to Hero for Natural Language Processing using TensorFlow!
https://www.youtube.com/watch?v=R-AG4-qZs1A&list=PLeo1K3hjS3uuvuAXhYjV2lMEShq2UYSwX
Natural Language Processing tutorial for beginners series in Python.
https://www.youtube.com/watch?v=rmVRLeJRkl4&list=PLoROMvodv4rMFqRtEuo6SGjY4XbRIVRd4
The foundations of the effective modern methods for deep learning applied to NLP.
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
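PyTorch is an important tool for deep-learning-based data science research, with considerable advantages in flexibility, readability, and performance; in recent years it has become the framework most commonly used in academia for implementing deep learning algorithms.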
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
RNN
https://d2l.ai/chapter_recurrent-neural-networks/rnn.html
A neural network that uses recurrent computation for hidden states is called a recurrent neural network (RNN).
https://www.deeplearningbook.org/contents/rnn.html
Recurrent neural networks, or RNNs (Rumelhart et al., 1986a), are a family of neural networks for processing sequential data.
https://www.ibm.com/think/topics/recurrent-neural-networks
A recurrent neural network or RNN is a deep neural network trained on sequential or time series data to create a machine learning (ML) model that can make sequential predictions or conclusions based on sequential inputs.
CNN
https://learnopencv.com/understanding-convolutional-neural-networks-cnn/
Convolutional Neural Network (CNN) forms the basis of computer vision and image processing.
[English] CNN Explainer
https://poloclub.github.io/cnn-explainer/
Learn Convolutional Neural Network (CNN) in your browser!
https://www.deeplearningbook.org/contents/convnets.html
Convolutional networks (LeCun, 1989), also known as convolutional neural networks, or CNNs, are a specialized kind of neural network for processing data.
https://www.youtube.com/watch?v=2xqkSUhmmXU
MIT Introduction to Deep Learning 6.S191: Lecture 3 Convolutional Neural Networks for Computer Vision
Transformer
https://huggingface.co/learn/llm-course/en/chapter1/4
Breaking down how Large Language Models work, visualizing how data flows through.
https://poloclub.github.io/transformer-explainer/
An interactive visualization tool showing you how transformer models work in large language models (LLM) like GPT.
https://www.youtube.com/watch?v=wjZofJX0v4M
Breaking down how Large Language Models work, visualizing how data flows through.
Prompt
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/introduction-prompt-design
A prompt is a natural language request submitted to a language model to receive a response back.
https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/prompt-engineering
These techniques aren't recommended for reasoning models like gpt-5 and o-series models.
https://www.youtube.com/watch?v=LWiMwhDZ9as
Learn and master the fundamentals of Prompt Engineering and LLMs with this 5-HOUR Prompt Engineering Crash Course!
Reinforcement learning
https://cloud.google.com/discover/what-is-reinforcement-learning?hl=en
Reinforcement learning (RL) is a type of machine learning where an "agent" learns optimal behavior through interaction with its environment.
https://huggingface.co/learn/deep-rl-course/unit0/introduction
This course will teach you about Deep Reinforcement Learning from beginner to expert. It’s completely free and open-source!
https://www.kaggle.com/learn/intro-to-game-ai-and-reinforcement-learning
Build your own video game bots, using classic and cutting-edge algorithms.
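Related positions
Experienced hire · 5+ years · A118837
1. Deeply parse massive vehicle operation logs to extract key information, including vehicle fault codes, sensor data, and driving behavior data, providing data support for fault diagnosis.
2. Apply data mining techniques such as cluster analysis and association rule mining to discover latent patterns and anomalous behavior in vehicle logs, warn of potential fault risks in advance, and provide a basis for preventive maintenance.
3. Build a retrieval system for vehicle fault diagnosis plans that, based on vehicle fault characteristics and historical repair records, quickly retrieves diagnosis plans and repair cases similar to the current fault as a reference for diagnosticians.
4. Use large language models and machine learning algorithms to optimize plan recommendation for existing remote diagnosis cases and generate new plans from the existing plan library, improving diagnostic efficiency and accuracy.
Updated 2025-05-26
Experienced hire · 5+ years · A35523
1. Responsible for the R&D and deployment of on-device CV algorithms, including but not limited to object detection, recognition, and tracking.
2. Responsible for algorithm engineering, including model productionization and optimization.
3. Responsible for the design and development of the on-device algorithm framework.
4. May also take part in some work related to multimodal large models.
Updated 2025-04-02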