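Ctrip Senior Financial Data Development Engineer / Expert (MJ030431)
Experienced hire · Full-time · 5+ years · Technical team · AI & BI · Location: Shanghai · Status: Hiring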
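Requirements
1. Bachelor's degree or above in a computer-related major, with 5+ years of data development experience at internet companies.
2. Deep understanding of data warehouse modeling theory (e.g. dimensional modeling) and layered warehouse design, with extensive hands-on implementation experience.
3. Expert in Hive SQL programming, with strong SQL performance tuning skills.
4. Familiar with the Hadoop ecosystem, with experience using and developing on components such as HDFS, MapReduce, Spark, Flink, and Kafka.
5. Experience processing massive data volumes, and an understanding of how different technical components optimize big-data reads, writes, and computation.
6. Highly sensitive to data; able to identify business problems from data and apply technical means to solve problems in real scenarios.
7. Understanding of common AI large-model technologies such as RAG and MCP; experience with AI-assisted programming and AI-driven intelligent analysis development is preferred.
8. Experience developing internal data analytics at internet companies is preferred, especially for finance and budget/management reporting.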
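Responsibilities
1. Design, develop, and optimize models for each layer (e.g. ODS, DWD, DWS, ADS) of the offline and real-time data warehouses to support data analysis and business applications.
2. Handle data warehouse development and report development for the group's financial data analysis system.
3. Independently deliver data ETL development for complex business logic, along with job scheduling and operational monitoring, ensuring the accuracy and stability of data processing pipelines.
4. Establish and monitor data quality rules; proactively detect, track, and resolve data quality issues to ensure data reliability and trustworthiness.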
Includes English-language materials
Education
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Performance tuning
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.
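Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows large data sets to be processed in a distributed manner across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial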
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
HDFS
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system for managing large data sets that can run on commodity hardware.
MapReduce
https://www.youtube.com/watch?v=bcjSe0xCHbE
https://www.youtube.com/watch?v=cHGaQz0E7AU
In this video I explain the basics of Map Reduce model, an important concept for any software engineer to be aware of.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Kafka
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Large models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
RAG
https://www.youtube.com/watch?v=sVcwVQRHIc8
Learn how to implement RAG (Retrieval Augmented Generation) from scratch, straight from a LangChain software engineer.
MCP
https://www.youtube.com/watch?v=eur8dUO9mvE
Unlock the secrets of MCP! 🚀 Dive into the world of Model Context Protocol and learn how to seamlessly connect AI agents to databases, APIs, and more. Roy Derks breaks down its components, from hosts to servers, and showcases real-world applications. Gain the knowledge to revolutionize your AI projects!
https://www.youtube.com/watch?v=L94WBLL0KjY
Let's talk about MCP or the Model Context Protocol.
Data analysis
[English] Data Analyst Roadmap
https://roadmap.sh/data-analyst
Step-by-step guide to becoming a Data Analyst in 2025
Related positions
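Experienced hire · 5+ years · Technical team · Development
1. Independently carry out requirements analysis, technical design, and development.
2. Write, test, and release server-side code for the group's finance-related systems (budgeting, management reporting, financial analysis, etc.).
3. As a core technical expert, take part in product requirement discussions, R&D solution design, and code implementation; understand business needs in depth, provide technical solutions, and drive efficient project development and delivery.
4. Handle system monitoring and troubleshooting, responding quickly to production issues and resolving them efficiently to keep the business running smoothly.
5. Work with the team to solve key problems and technical challenges in the system.
Updated 2025-09-08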
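Experienced hire · 6+ years · Information technology
1. Research, design, and develop data models for the e-commerce finance domain, working closely with business departments to provide data support.
2. Handle data operations and governance for the e-commerce finance domain, ensuring data quality.
3. Contribute to building a systematic data architecture for the e-commerce finance domain, improving the stability of data services.
Updated 2025-06-20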
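Experienced hire · Programming & technology
1. Take part in the design, development, and operation of the data warehouse for miHoYo's functional-line systems, and build a business data metrics system for its information systems.
2. Build data models and business metric systems for different business scenarios; establish and improve routine business data reporting so that business execution is disclosed promptly, accurately, and completely.
3. Proactively understand and sensibly abstract business requirements, build domain data assets, and work closely with business and R&D teams to provide technical solutions for data analysis and data model development.
4. Build real-time and offline data warehouses on compute engines such as Flink and Spark, carry out ETL development, and ensure data quality.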