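Meituan Distributed Storage R&D Engineer
Experienced hire, full-time
Core Local Commerce - Basic R&D Platform
Location: Beijing | Shanghai
Status: Hiring
Requirements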
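1. Familiar with the principles of one or more mainstream big data systems such as Spark, Presto, Hive, and HBase;
2. Proficient in C/C++ or Java, and in network programming, multithreading, and high-concurrency programming;
3. Understand how large-scale distributed storage systems work, and know how to analyze metadata, data availability, and performance for the various architectures used in production environments.
Preferred qualifications
-Familiarity with common distributed storage products and their use in big data scenarios, including but not limited to HDFS, HBase, Ceph, CubeFS, JuiceFS, Lustre, and Swift;
-Familiarity with open-source data lake products such as Apache Hudi and Apache Iceberg;
-Development experience with distributed databases such as FoundationDB, TiDB, and CockroachDB.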
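Responsibilities
Provide highly available, highly reliable, ultra-large-scale HDFS file storage and Hive MetaStore metadata storage services for the data warehouse and machine learning platforms; resolve the metadata bottlenecks and cost problems caused by massive numbers of files; deliver industry-leading reliability and availability metrics; implement cross-datacenter disaster recovery and tiered data movement; and keep the business's capacity management costs as low as possible.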
Includes English-language materials
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Presto
[English] What is Presto?
https://prestodb.io/what-is-presto/
https://www.tutorialspoint.com/apache_presto/index.htm
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model, similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on the Hadoop file system, and ways to interact with the HBase shell.
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Network programming
https://www.youtube.com/watch?v=2HrYIl6GpYg
I will make a simple HTTP web server with the C Programming Language.
https://www.youtube.com/watch?v=8z6okCgdREo
This tutorial is for Gophers who have written a command line or an API application, but have little to no experience in lower-level concepts like reading and writing to sockets, working with channels, and managing multiple goroutines.
https://www.youtube.com/watch?v=bdIiTxtMaKA&list=PL9IEJIKnBJjH_zM5LnovnoaKlXML5qh17
https://www.youtube.com/watch?v=bzja9fQWzdA
Implement the ubiquitous TCP protocol that underlies much of the traffic on the internet!
[English] 📺 Network Programming with Python Course (build a port scanner, mailing client, chat room, DDOS)
https://www.youtube.com/watch?v=FGdiSJakIS4
Learn network programming in Python by building four projects. You will learn to build a mailing client, a DDOS script, a port scanner, and a TCP Chat Room.
https://www.youtube.com/watch?v=gntyAFoZp-E
https://www.youtube.com/watch?v=JiuouCJQzSQ
Explore the fundamentals of networking in Rust by building a simple TCP server.
https://www.youtube.com/watch?v=JRTLSxGf_6w
https://www.youtube.com/watch?v=sFizpxHkIlI
In this video we'll cover SOCKET PROGRAMMING in JAVA.
https://www.youtube.com/watch?v=sXW_sNGvqcU
Multithreading
https://liaoxuefeng.com/books/java/threading/basic/index.html
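Compared with single-threaded code, the defining feature of multithreaded programming is that threads often need to read and write shared data, and that access must be synchronized.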
https://www.youtube.com/watch?v=_uQgGS_VIXM&list=PLsc-VaxfZl4do3Etp_xQ0aQBoC-x5BIgJ
https://www.youtube.com/watch?v=IEEhzQoKtQU
https://www.youtube.com/watch?v=mTGdtC9f4EU&list=PLL8woMHwr36EDxjUoCzboZjedsnhLP1j4
https://www.youtube.com/watch?v=TPVH_coGAQs&list=PLk6CEY9XxSIAeK-EAh3hB4fgNvYkYmghp
https://www.youtube.com/watch?v=xPqnoB2hjjA
This video is an introduction to multithreading in modern C++.
https://www.youtube.com/watch?v=YKBwKy5PrpQ
Rust threading is easy to implement and improves the efficiency of your applications on multi-core systems!
High concurrency
https://www.baeldung.com/concurrency-principles-patterns
In this tutorial, we’ll discuss some of the design principles and patterns that have been established over time to build highly concurrent applications.
https://www.baeldung.com/java-concurrency
Handling concurrency in an application can be a tricky process with many potential pitfalls. A solid grasp of the fundamentals will go a long way to help minimize these issues.
https://www.oreilly.com/library/view/concurrency-in-go/9781491941294/
You’ll understand how Go chooses to model concurrency, what issues arise from this model, and how you can compose primitives within this model to solve problems.
https://www.oreilly.com/library/view/modern-concurrency-in/9781098165406/
With this book, you'll explore the transformative world of Java 21's key feature: virtual threads.
https://www.youtube.com/watch?v=qyM8Pi1KiiM
https://www.youtube.com/watch?v=wEsPL50Uiyo
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
HDFS
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system that manages large data sets and can run on commodity hardware.
Ceph
https://docs.ceph.com/en/squid/start/beginners-guide/
The purpose of A Beginner’s Guide to Ceph is to make Ceph comprehensible.
https://www.youtube.com/watch?v=oEKJnHAfSiw
Swift
[English] A Swift Tour
https://docs.swift.org/swift-book/documentation/the-swift-programming-language/guidedtour/
Explore the features and syntax of Swift.
https://www.hackingwithswift.com/learn
Free Swift and iOS tutorials
https://www.youtube.com/watch?v=8Xg7E9shq0U
Learn the Swift programming language in this full tutorial for beginners.
Apache
https://www.apache.org/
The Apache® Software Foundation (ASF) provides software for the public good, guided by community over code.
TiDB
CockroachDB
https://www.baeldung.com/cockroachdb-java
This tutorial is an introductory guide to using CockroachDB with Java.
https://www.cockroachlabs.com/resources/tutorial/
Tutorials in all programming languages.
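Related positions
Experienced hire, ACG
-Design, develop, and optimize public cloud storage products, including but not limited to object storage, distributed block storage, cloud message queue, cloud cache, relational database, cold data storage, and data transfer services
-Develop and optimize large-scale, high-performance service software
-Provide distributed storage technology and product solutions for Baidu Open Cloud industry customers
-Work with the big data, cloud computing, edge cloud, video cloud, and other teams to build integrated high-performance solutions
Updated 2025-06-10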
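Experienced hire, ACG
-Design, develop, and optimize public cloud storage products, including but not limited to object storage, distributed block storage, cloud message queue, cloud cache, relational database, cold data storage, and data transfer services
-Develop and optimize large-scale, high-performance service software
-Provide distributed storage technology and product solutions for Baidu Open Cloud industry customers
-Work with the big data, cloud computing, edge cloud, video cloud, and other teams to build integrated high-performance solutions
Updated 2025-03-03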