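58.com (58同城) Senior Big Data Development Engineer, Hadoop Track (J20296)
Experienced hire | Full-time | Technology | Location: Beijing | Status: Hiring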
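Requirements
Qualifications:
• Solid grounding in computer science fundamentals, with strong data structures and algorithms skills
• Expert in Java programming, with excellent system debugging and profiling skills and experience; familiar with common object-oriented design patterns; strong system architecture design ability
• Expert in multithreaded programming; distributed systems development experience preferred
• Expert in the architecture and internals of HDFS/YARN, with experience customizing and optimizing large-scale clusters
• Familiar with big data open source components in the Hadoop ecosystem, such as HBase, Hive, and Spark
• Familiarity with Kubernetes preferred; experience participating in open source communities preferred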
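Responsibilities
Responsibilities: Plan and build Hadoop and its surrounding ecosystem, which supports 58's EB-scale data storage and the scheduling of hundreds of millions of tasks; lead the cross-data-center architecture and rollout for clusters of nearly ten thousand machines; apply new technical directions such as online/offline colocation, storage-compute separation, and cloud migration; and build a stable, efficient next-generation big data platform.
• Own architecture design, technology selection, and resolution of hard technical problems for cross-data-center deployment, online/offline colocation, storage-compute separation, and cloud migration
• Lead the team in custom development of Hadoop and its surrounding ecosystem
• Own deep performance optimization of large-scale Hadoop clusters
• Engage with the open source community, actively bring in major community features and improvements, and contribute back upstream to grow influence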
Includes English-language materials
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Object-oriented programming
https://liaoxuefeng.com/books/java/oop/index.html
Object-Oriented Programming is abbreviated OOP.
https://liaoxuefeng.com/books/python/oop/index.html
Object-Oriented Programming (OOP) is a way of thinking about program design.
https://www.youtube.com/watch?v=SiBw7os-_zI
Learn the basics of object-oriented programming all in one video.
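Design patterns
https://liaoxuefeng.com/books/java/design-patterns/index.html
Design patterns are pieces of code-design experience that are reused repeatedly in software design. The purpose of using them is to make code reusable and to improve its extensibility and maintainability.
[English] Design Patterns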
https://refactoring.guru/design-patterns
Design patterns are typical solutions to common problems in software design. Each pattern is like a blueprint that you can customize to solve a particular design problem in your code.
https://www.youtube.com/watch?v=NU_1StN5Tkk
Design Patterns tutorial explained in simple words using real-world examples.
System design
https://roadmap.sh/system-design
Everything you need to know about designing large scale systems.
https://www.youtube.com/watch?v=F2FmTdLtb_4
This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture with clear explanations, real-world examples, and practical strategies.
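Multithreading
https://liaoxuefeng.com/books/java/threading/basic/index.html
Compared with single-threaded programming, multithreaded programming is characterized by threads that frequently read and write shared data and therefore need synchronization.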
https://www.youtube.com/watch?v=_uQgGS_VIXM&list=PLsc-VaxfZl4do3Etp_xQ0aQBoC-x5BIgJ
https://www.youtube.com/watch?v=IEEhzQoKtQU
https://www.youtube.com/watch?v=mTGdtC9f4EU&list=PLL8woMHwr36EDxjUoCzboZjedsnhLP1j4
https://www.youtube.com/watch?v=TPVH_coGAQs&list=PLk6CEY9XxSIAeK-EAh3hB4fgNvYkYmghp
https://www.youtube.com/watch?v=xPqnoB2hjjA
This video is an introduction to multithreading in modern C++.
https://www.youtube.com/watch?v=YKBwKy5PrpQ
Rust threading is easy to implement and improves the efficiency of your applications on multi-core systems!
HDFS
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system for managing large data sets that can run on commodity hardware.
YARN
[English] Apache Hadoop YARN
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
The fundamental idea of YARN is to split up the functionalities of resource management and job scheduling/monitoring into separate daemons.
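Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to several thousand machines.
[English] Hadoop Tutorial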
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on the Hadoop file system, and ways to interact with the HBase shell.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Kubernetes
https://kubernetes.io/docs/tutorials/kubernetes-basics/
This tutorial provides a walkthrough of the basics of the Kubernetes cluster orchestration system.
https://kubernetes.io/zh-cn/docs/tutorials/kubernetes-basics/
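This tutorial covers the basics of the Kubernetes cluster orchestration system. Each module contains some background information on major Kubernetes features and concepts, and includes an online tutorial for you to follow.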
https://www.youtube.com/watch?v=s_o8dwzRlu4
Hands-On Kubernetes Tutorial | Learn Kubernetes in 1 Hour - Kubernetes Course for Beginners
https://www.youtube.com/watch?v=X48VuDVv0do
Full Kubernetes Tutorial | Kubernetes Course | Hands-on course with a lot of demos
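Related positions
Experienced hire | 1+ years | Technology
Responsibilities: Handle data ingestion, cleansing, and transformation, and take part in building the financial data warehouse and developing data reports; take part in risk-control feature development and data mining to support financial risk control and marketing applications, and ensure data quality; follow big data technology directions through continuous tracking and learning, and tackle hard technical problems.
Updated 2025-08-15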
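Experienced hire | 3-5 years | Technology
1. Build and develop the data warehouse for DiDi's core ride-hailing business, carry out domain-level warehouse modeling with continuous optimization, and keep improving data efficiency
2. Abstract core business processes into easy-to-use data architectures, general-purpose analysis frameworks, and data application products
3. Optimize the data development workflow and architecture, keep improving the data governance system, and keep raising the quality of the data warehouse
4. Explore the application of new technologies to drive technical transformation and upgrades
Updated 2025-09-19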
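Experienced hire | 3-5 years | Backend development
Responsibilities:
- Take part in developing Xiaohongshu's commercialization data products, with business areas including but not limited to sales performance, customer analysis, and agency monitoring
- Work with product, operations, backend, QA, and DevOps roles on business understanding, requirements review, solution discussions, and system maintenance
- Design and implement efficient, scalable data architectures that support complex business logic and large data volumes, and keep improving delivery quality and efficiency
- Optimize the architecture, stability, cost, and performance of complex data pipelines, keeping production services stable and resource usage reasonable
Updated 2025-10-16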