
同程旅行 (Tongcheng Travel): Big Data Development Engineer
Experienced hire | Full-time | Location: Suzhou | Status: Hiring
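Requirements
1. Bachelor's degree or above in computer science or a related field.
2. Solid grounding in computer science theory, strong skills in data structures and algorithms, and a technical geek spirit.
3. Proficient in Java programming, with excellent system debugging/profiling skills and experience.
4. Familiar with common object-oriented design patterns, with excellent system architecture design skills.
5. Familiar with open source big data technologies such as Hadoop/HBase/Flink/Spark/Hive/Presto; candidates active in open source communities are preferred.
6. Hands-on experience developing big data business applications, plus good project communication and coordination skills.
7. Experience customizing or extending batch processing components is preferred.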
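Responsibilities
More than 80% of the team is active in open source communities, including several Committers. We welcome anyone interested in low-level big data technology to come and challenge themselves with us! (This is not a data warehouse role.)
Work base options: Suzhou / Beijing / Chengdu
Job description: the work is based on open source technologies such as Hadoop, Flink, Spark, Hive, and cloud native.
1. Plan and operate big data clusters; tackle technical problems in the clusters, including cluster tuning, source code analysis, and bug fixes.
2. Develop shared big data components and middleware.
3. Work on storage components, batch processing, stream computing, OLAP, and ML/DL, closely combining technology with business scenarios so that data delivers maximum business value.
4. Support the construction of the data middle platform; support the business by designing highly scalable, high-performance, highly available big data systems that meet its requirements.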
Including English-language materials
Education
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Object-oriented programming
https://liaoxuefeng.com/books/java/oop/index.html
Object-oriented programming is commonly abbreviated as OOP.
https://liaoxuefeng.com/books/python/oop/index.html
Object-oriented programming (OOP) is an approach to program design.
https://www.youtube.com/watch?v=SiBw7os-_zI
Learn the basics of object-oriented programming all in one video.
Design patterns
https://liaoxuefeng.com/books/java/design-patterns/index.html
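Design patterns are pieces of code design experience that are used again and again in software design. Their purpose is to make code reusable and to improve its extensibility and maintainability.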
[English] Design Patterns
https://refactoring.guru/design-patterns
Design patterns are typical solutions to common problems in software design. Each pattern is like a blueprint that you can customize to solve a particular design problem in your code.
https://www.youtube.com/watch?v=NU_1StN5Tkk
Design Patterns tutorial explained in simple words using real-world examples.
System design
https://roadmap.sh/system-design
Everything you need to know about designing large scale systems.
https://www.youtube.com/watch?v=F2FmTdLtb_4
This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture with clear explanations, real-world examples, and practical strategies.
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
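Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large datasets to be processed in a distributed way across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.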
[English] Hadoop Tutorial
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with the HBase shell.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Presto
[English] What is Presto?
https://prestodb.io/what-is-presto/
https://www.tutorialspoint.com/apache_presto/index.htm
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Related positions
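Experienced hire | 网易数智 (NetEase Shuzhi)
1. Own the iterative development of big data components such as Iceberg on the NetEase big data platform.
2. Put Iceberg and related technologies into practice in business scenarios, and analyze and diagnose problems.
3. Help build the stability of Hive and other components for big data metadata services, and diagnose problems.
Updated 2025-04-17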
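Experienced hire | A166444A
1. Design and implement sound offline/real-time data architectures for large-scale recommendation systems.
2. Design and implement flexible, scalable, stable, high-performance storage systems and computing models.
3. Troubleshoot production systems; design and implement the mechanisms and tools needed to keep production systems running stably overall.
4. Build industry-leading distributed systems, such as offline/online storage and batch/stream computing frameworks, providing reliable infrastructure for massive data and large-scale business systems.
Updated 2025-02-20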

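Experienced hire | 5+ years | Technology
1. Build the foundational data for the 哈啰街猫 (Hello Street Cat) business, including creating and maintaining base data models, developing reports, and data development for business systems.
2. Understand the 哈啰街猫 business lines such as feeding and e-commerce, and build user profile and tagging systems based on business needs to support recommendation and user operations.
3. Take part in developing data products and applications, uncover the business value of data, and support data-driven operations.
Updated 2025-02-12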