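miHoYo Senior Backend Development Engineer (Computing Platform)
Experienced hire | Full-time | Programming & Technology | Location: Shanghai | Status: Hiring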
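Requirements
1. Bachelor's degree or above; majors in computer science, software engineering, data science, or related fields are preferred.
2. Proficient in Java/Scala; familiarity with Python/Golang is a plus; good coding standards and engineering practices.
3. Deep understanding of distributed computing and storage principles; familiar with at least one mainstream engine such as Flink, Spark, Hive, or Doris/StarRocks.
4. Familiar with OLTP databases such as MySQL and PostgreSQL; understanding of data warehouse modeling (ODS/DWD/DWS/ADS) and ETL/ELT workflows.
5. Familiar with Kubernetes, containerized deployment, and resource scheduling; experience building big data platforms.
6. Able to diagnose complex problems and tune performance, and to ensure stability under high concurrency and large data volumes.
7. Strong cross-team communication and collaboration skills; able to support data development, business analysis, financial management reporting, and other scenarios.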
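Responsibilities
1. Develop the backend of the enterprise data computing platform, supporting batch processing, stream processing, real-time computing, interactive queries, and other scenarios; design and implement core modules such as task scheduling, resource management, access control, and compute job orchestration.
2. Data processing engines: integrate and optimize compute and storage engines such as Flink, Spark, Doris, StarRocks, and Paimon; design a unified job submission and execution framework to improve the platform's computing efficiency and stability.
3. Provide external APIs/SDKs that support upper-layer applications such as data development, metrics systems, report analysis, and machine learning; build self-service compute capabilities that lower the barrier to entry for business teams.
4. Own performance optimization and troubleshooting for large-scale data jobs to meet SLAs; build monitoring, alerting, auditing, task tracing, and cost management systems.
5. Platform architecture evolution: take part in architecture planning for the computing platform and drive its evolution toward cloud-native and lakehouse architectures; research new technologies and drive their adoption, such as Kubernetes, storage-compute separation, vectorized computing, and unified stream and batch processing.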
Includes English-language materials
Education+
Data Science+
https://roadmap.sh/ai-data-scientist
Step by step roadmap guide to becoming an AI and Data Scientist
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala+
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for absolute beginners, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Go+
https://www.youtube.com/watch?v=8uiZC0l4Ajw
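A complete Golang tutorial that runs under an hour from start to finish, including a full demonstration of building an API in Go. No filler, just what you need to know.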
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Doris+
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
StarRocks+
https://docs.starrocks.io/docs/quick_start/
These Quick Start guides will help you get going with a small StarRocks environment.
https://itnext.io/introduction-to-starrocks-a-new-modern-analytical-database-1db2177d26e1
Recently, I had the opportunity to explore StarRocks which is the new kid in the block when talking about massive scale databases which are able to handle petabytes of data.
MySQL+
https://juejin.cn/post/7190306988939542585
An in-depth, all-in-one MySQL learning roadmap covering database fundamentals, SQL statement usage, database constraints, schema design, and more.
[English] MySQL Tutorial
https://www.mysqltutorial.org/
your go-to resource for mastering MySQL in a fast, easy, and enjoyable way.
https://www.youtube.com/watch?v=5OdVJbNCSso
MySQL SQL tutorial for beginners
https://www.youtube.com/watch?v=7S_tz1z_5bA
This beginner-friendly course teaches you SQL from scratch.
PostgreSQL+
[English] PostgreSQL Tutorial
https://neon.com/postgresql/tutorial
This PostgreSQL tutorial helps you quickly understand PostgreSQL.
[English] PostgreSQL Tutorial
https://www.pgtutorial.com/
This PostgreSQL tutorial will teach you about PostgreSQL from beginner to advanced.
https://www.youtube.com/watch?v=qw--VYLpxG4
It is the most advanced open source database system widely used to build back-end systems.
https://www.youtube.com/watch?v=SpfIwlAYaKk
Learn PostgreSQL, one of the world's most advanced and robust open-source relational database systems.
ETL+
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
Kubernetes+
https://kubernetes.io/docs/tutorials/kubernetes-basics/
This tutorial provides a walkthrough of the basics of the Kubernetes cluster orchestration system.
https://kubernetes.io/zh-cn/docs/tutorials/kubernetes-basics/
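This tutorial (in Chinese) introduces the basics of the Kubernetes cluster orchestration system. Each module contains background information on major Kubernetes features and concepts, along with an online tutorial to work through.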
https://www.youtube.com/watch?v=s_o8dwzRlu4
Hands-On Kubernetes Tutorial | Learn Kubernetes in 1 Hour - Kubernetes Course for Beginners
https://www.youtube.com/watch?v=X48VuDVv0do
Full Kubernetes Tutorial | Kubernetes Course | Hands-on course with a lot of demos
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Performance Tuning+
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.
High Concurrency+
https://www.baeldung.com/concurrency-principles-patterns
In this tutorial, we’ll discuss some of the design principles and patterns that have been established over time to build highly concurrent applications.
https://www.baeldung.com/java-concurrency
Handling concurrency in an application can be a tricky process with many potential pitfalls. A solid grasp of the fundamentals will go a long way to help minimize these issues.
https://www.oreilly.com/library/view/concurrency-in-go/9781491941294/
You’ll understand how Go chooses to model concurrency, what issues arise from this model, and how you can compose primitives within this model to solve problems.
https://www.oreilly.com/library/view/modern-concurrency-in/9781098165406/
With this book, you'll explore the transformative world of Java 21's key feature: virtual threads.
https://www.youtube.com/watch?v=qyM8Pi1KiiM
https://www.youtube.com/watch?v=wEsPL50Uiyo
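Related positions
Experienced hire | 5+ years | 诚云科技
1. Design and implement, using an IBN approach, efficient online definition and automated delivery capabilities for physical network device architecture, ensuring that network devices are delivered efficiently and reliably.
2. Build and optimize the automated build-out and delivery process for Alibaba Cloud's physical network, using AI to improve network device delivery efficiency and to meet the delivery SLA attainment rate.
3. Take part in designing and developing the physical and logical resource management platform for Alibaba Cloud's underlying network, enabling efficient, accurate management and operation of massive volumes of resource data.
Updated 2025-09-12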
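Experienced hire | K6077
1. Take part in server-side system design and architecture optimization, ensuring high availability, stability, security, and high performance.
2. Build the Life Services foundational platform and components, including stability governance, a contingency plan platform, call-chain management, and compute and storage middleware.
3. Working from each business scenario, deliver well-optimized service governance practices, including but not limited to performance bottleneck analysis on critical paths and troubleshooting of business issues.
4. Manage resource costs for Life Services, secure resource supply, and keep driving down operating costs.
5. Provide technical support for major Life Services campaigns, including capacity assessment and contingency drills for disaster recovery scenarios, so that core systems run smoothly during campaigns.
Updated 2024-07-25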
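Experienced hire | 5+ years | A129071
1. Build complex e-commerce retrieval systems across dimensions such as products, videos, and livestream clips, covering the design, development, and maintenance of components including the index-building framework, the search framework, and the stability assurance framework.
2. Design, develop, and maintain the RAG framework, knowledge base, and middleware architecture for the platform governance domain.
3. Build deduplication systems for e-commerce products, videos, livestreams, and other content types; build a general-purpose clustering architecture for massive data; keep improving system throughput, performance, and stability; and safeguard the overall quality and efficiency of core e-commerce features.
Updated 2024-12-16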