Xiaomi Data R&D Engineer
Experienced hire; full-time; 3+ years; A92288; Location: Beijing; Status: Hiring
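Requirements
1. Bachelor's degree or above in computer science, mathematics, or a related field, with at least 3 years of big data R&D experience in the internet industry;
2. Familiarity with several of SQL, Python, Scala, Hadoop, Hive, Kafka, Spark, and Flink; an understanding of data lakes and other big data processing tools and technologies; strong performance-tuning skills;
3. Proficiency in the batch computing technology stack and an understanding of stream computing technologies;
4. A deep understanding of the essence of data warehousing; able to research the business independently, set the direction for data warehouse development and define a strategy for implementing it, and design data warehouse systems for highly complex businesses;
5. Good communication skills, learning ability, and team spirit; willing to challenge yourself, with drive and intellectual curiosity.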
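Responsibilities
1. Own data warehouse architecture design, standardized event tracking, data modeling, and ETL development for Xiaomi's internet TV, video, and related businesses;
2. Take part in data governance to improve data usability and data quality, working closely with the data platform team;
3. Understand and appropriately abstract business requirements and solve the problems of the businesses you serve, working closely with business teams;
4. Track the industry's advanced data technology stacks and solutions.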
Includes English-language materials
Education
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and processing relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starting from zero, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
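Scala
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large datasets to be processed in a distributed manner across clusters of computers using simple programming models, and it scales from a single machine to thousands of machines.
[English] Hadoop Tutorial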
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Kafka
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Related positions
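Internship; Alibaba Cloud; Class of 2026
Alibaba Cloud continues to deepen its strategic investment in AI technology, building core scenarios around AI and cloud computing infrastructure, AI foundation model platforms, and enterprise-grade AI applications. To that end, we are actively recruiting outstanding talent. If you want to take part in collecting, storing, and processing Alibaba Cloud's big data, processing data on a distributed big data platform to support business management decisions; if you want to take part in the model design, development, and maintenance of Alibaba Cloud's big data system, managing and organizing EB-scale data effectively through metadata and quality systems; if you want to take part in developing Alibaba Cloud's big data products, applying your business sense and using data analysis and algorithms to uncover the opportunities behind the data and explore the commercialization of big data; if you want to work with world-leading big data processing and application technologies and platforms and be mentored by leading experts at the forefront of big data; then join us!
Updated 2025-06-17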
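Internship; Cainiao Group; 2026
1. Own the collection, storage, and processing of Cainiao Group's big data, processing data on a distributed big data platform to support business management decisions;
2. Take part in the model design, development, and maintenance of Cainiao Group's big data system, managing and organizing EB-scale data effectively through metadata and quality systems;
3. Take part in developing Cainiao Group's big data products, using data analysis and algorithms to uncover the business opportunities behind the data and explore the commercialization of big data.
Updated 2025-06-24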
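Internship; Ele.me; Spring 202
Responsibilities include, but are not limited to:
1. Take part in collecting, storing, and processing Ele.me's big data, processing data on a distributed big data platform to support business management decisions;
2. Take part in the model design, development, and maintenance of Ele.me's big data system, managing and organizing EB-scale data effectively through metadata and quality systems;
3. Take part in developing Ele.me's big data products, applying your business sense and using data analysis and algorithms to uncover the opportunities behind the data and explore the commercialization of big data.
Updated 2025-02-27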