携程资深数据开发工程师(MJ022748)
社招全职5年以上火车票业务项目/流程/数据地点:上海状态:招聘
任职要求
1、大学全日制本科及以上学历,计算机、数学相关专业,从事大数据领域至少5年以上; 2、具备数据仓库模型设计与ETL开发经,有算法特征数据经验更好; 3、具备熟练使用spark和flink开发批流数据处理任务的能力; 4、具有一定数据架构基础,熟悉大数据生态,包括但不限于 Hadoop/hive/spark/k…
登录查看完整任职要求
微信扫码,1秒登录
工作职责
1、参与数据仓库的设计与搭建,重点支持算法团队对特征数据的使用; 2、参与公司数据应用(产品)的开发,推动业务部门的数据化运营; 3、探索包括大模型在内的新技术在数据应用和数仓领域的应用;
包括英文材料
学历+
大数据+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
数据仓库+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL+
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
算法+
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Spark+
[英文] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop 为庞大的计算机集群提供可靠的、可伸缩的应用层计算和存储支持,它允许使用简单的编程模型跨计算机群集分布式处理大型数据集,并且支持在单台计算机到几千台计算机之间进行扩展。
[英文] Hadoop Tutorial
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
还有更多 •••
相关职位
社招2年以上A248307
1、遵循Robots协议,参与爬虫系统的建设与优化,满足各类业务数据需求; 2、负责分布式爬虫系统的建设,优化数据调度、抓取、解析、存储全栈流程; 3、帮助团队攻克各种爬虫技术难关,提升海量数据系统的抓取效果与性能。
更新于 2023-04-24北京
社招5-10年网易职能
1、负责网易集团财经数据中台的数仓规划与设计 2、完成相关原始数据采集、清洗、整理、去重和治理,保证数据及时性、完整性、一致性和准确性。 3、参与业务需求调研,根据业务需求设计数据仓库维度模型,并完成数据模型开发,沉淀数据指标。 4、持续改进优化ETL、分析处理等问题,对结构化的数据做数据分析; 5、对项目开发进度、代码质量进行管控、完成技术文档的沉淀。
更新于 2025-10-31杭州
社招3年以上技术
1.参与ERP数仓的开发、维护、优化及相关技术支持工作,以及财务&法务&采购等数据体系建设; 2.负责数据仓库ETL流程的优化及解决相关技术问题; 3.参与数据产品设计和评审,保障数据平台架构稳定; 4.为日常项目中需求提供数据支持,并且在一定程度上给予评估和建议。
更新于 2025-06-17北京