米哈游【提前批】数据开发工程师
校招全职程序&技术类地点:上海状态:招聘
任职要求
1、2026届应届生,获得本科及以上学历,计算机、软件工程等相关专业优先; 2、熟练掌握 Python / Java / Scala 至少一门编程语言,了解主流的开发框架和工具; 3、熟悉开源大数据相关技术栈,包括 Hadoop / Hive / Spark / Kafka / Flink / OLAP引擎等; 4、扎实的SQL能力,熟悉复杂查询和性能优化; 5、强烈的责任心和团队协作精神,优秀的问题分析和解决能力,对数据技术有热情,持续学习新技术。 加分项 参与过AI项目实践,熟悉AI工程领域,例如Prompt Engineering、RAG、向量检索等。
工作职责
1、参与企业级数据平台建设,负责数据采集、存储、计算、可视化全链路开发; 2、深入业务场景,参与数据模型设计与优化,提升数据质量和计算效率; 3、参与AI数据产品设计与开发。
包括英文材料
学历+
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
中文,免费,零起点,完整示例,基于最新的Python 3版本。
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala+
开发框架+
[英文] Understanding Modern Development Frameworks: A Guide for Developers and Technical Decision-makers
https://www.freecodecamp.org/news/understanding-modern-development-frameworks-guide-for-devs/
大数据+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop 为庞大的计算机集群提供可靠的、可伸缩的应用层计算和存储支持,它允许使用简单的编程模型跨计算机群集分布式处理大型数据集,并且支持在单台计算机到几千台计算机之间进行扩展。
[英文] Hadoop Tutorial
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[英文] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
OLAP+
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
什么是SQL?简单地说,SQL就是访问和处理关系数据库的计算机标准语言。
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Prompt+
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/introduction-prompt-design
A prompt is a natural language request submitted to a language model to receive a response back.
https://learn.microsoft.com/en-us/azure/ai-foundry/openai/concepts/prompt-engineering
These techniques aren't recommended for reasoning models like gpt-5 and o-series models.
https://www.youtube.com/watch?v=LWiMwhDZ9as
Learn and master the fundamentals of Prompt Engineering and LLMs with this 5-HOUR Prompt Engineering Crash Course!
RAG+
https://www.youtube.com/watch?v=sVcwVQRHIc8
Learn how to implement RAG (Retrieval Augmented Generation) from scratch, straight from a LangChain software engineer.
相关职位
校招程序&技术类
1、负责Web组件、功能模块、产品页面或游戏研发效能工具的前端开发; 2、与产品经理、UX设计人员共同参与产品设计评审; 3、按照项目计划与设计稿,设计技术方案并实现,按时完成需求; 4、持续优化用户体验,并保证兼容性、执行效率和可扩展性。
校招程序&技术类
1.涉及协同文档富文本编辑器、协同在线表格、多维表、思维导图、白板,或者Web组件、功能模块和产品页面的前端开发; 2.与产品经理、UX设计人员共同参与产品设计评审; 3.按照项目计划与设计稿,设计技术方案并实现,按时完成需求; 4.持续优化用户体验,并保证兼容性、执行效率和可扩展性。
校招程序&技术类
1、负责平台业务的后端研发工作; 2、参与核心系统和基础服务的方案设计和代码开发,并且持续调优; 3、和团队一起攻克各种高并发、容灾、系统解耦等方面的技术难关,保证系统的高可用性、高可靠性和高扩展性; 4、参与新技术预研和方案选型,参与关键技术点的攻坚工作。