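Baidu Distributed Storage R&D Intern Engineer (J71230)
Internship / Part-time | ACG | Location: Beijing | Shanghai | Shenzhen | Status: Hiring
Requirements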
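- Currently enrolled in a bachelor's program or above, majoring in a computer-related field, and available to intern for at least 3 months
- Proficient in at least one of the following programming languages: C/C++, Java, Python, Go, etc.
- Familiar with common data structures and algorithm design, and with the principles of storage devices, file systems, and the Linux operating system
- Strongly interested in distributed storage systems, a good learner, and eager to take on the challenges facing ultra-large-scale cloud storage systems in cloud computing environments
- Passionate and creative, with strong learning ability and good teamwork skills
- Candidates meeting any of the following are preferred:
- Experience developing or using public cloud products, including Amazon AWS, Azure, GCE, aliyun
- Familiar with distributed systems theory, with experience designing and architecting large-scale distributed systems (including Hadoop/HDFS/Openstack/Ceph/mongodb/dynamodb/aws-s3/GFS/BigTable, etc.)
- Familiar with database technology, with development experience in database kernels or NoSQL databases
- Familiar with operating system kernels, especially storage devices, file systems, and related parts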
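Responsibilities
- Design, develop, and optimize public cloud storage products, including but not limited to object, block, file, and table storage services
- Develop and optimize large-scale, high-performance service software
- Provide distributed storage technology and product solutions for Baidu Open Cloud industry customers
- Work with the big data, cloud computing, edge cloud, video cloud, and other teams to build end-to-end high-performance solutions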
Includes English-language materials
Education
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for absolute beginners, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Go
https://www.youtube.com/watch?v=8uiZC0l4Ajw
A complete Golang tutorial! Under an hour from start to finish, including a full demo of how to build an API in Go. No fluff, only what you need to know.
Data Structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Full course tutorial on data structures and algorithms in Java.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step-by-step guide to learning Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Linux
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
AWS
https://aws.amazon.com/
Amazon Web Services offers reliable, scalable, and inexpensive cloud computing services. Free to join, pay only for what you use.
Azure
https://azure.microsoft.com/
Invent with purpose, realize cost savings, and make your organization more efficient with Microsoft Azure’s open and flexible cloud computing platform.
Distributed Systems
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
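Hadoop provides reliable, scalable application-layer compute and storage support for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to several thousand machines.
[English] Hadoop Tutorial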
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
HDFS
https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
https://www.ibm.com/cn-zh/think/topics/hdfs
The Hadoop Distributed File System (HDFS) is a file system that manages large data sets and can run on commodity hardware.
Ceph
https://docs.ceph.com/en/squid/start/beginners-guide/
The purpose of A Beginner’s Guide to Ceph is to make Ceph comprehensible.
https://www.youtube.com/watch?v=oEKJnHAfSiw
DynamoDB
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStartedDynamoDB.html
You’ll learn how to connect to, create, and manage DynamoDB tables in the following sections.
https://dynobase.dev/dynamodb-tutorials/
Collection of tutorials and articles to help you solve problems, make decisions and understand DynamoDB.
https://www.hellointerview.com/learn/system-design/deep-dives/dynamodb
DynamoDB is a fully-managed, highly scalable, key-value service provided by AWS.
https://www.scylladb.com/learn/dynamodb/introduction-to-dynamodb/
Amazon DynamoDB is a cloud-native NoSQL primarily key-value database.
https://www.youtube.com/watch?v=2k2GINpO308
In this video, I explain to you the core concepts of dynamodb and walk you through the console.
Kernel
https://www.youtube.com/watch?v=C43VxGZ_ugU
I rummage around the Linux kernel source and try to understand what makes computers do what they do.
https://www.youtube.com/watch?v=HNIg3TXfdX8&list=PLrGN1Qi7t67V-9uXzj4VSQCffntfvn42v
Learn how to develop your very own kernel from scratch in this programming series!
https://www.youtube.com/watch?v=JDfo2Lc7iLU
Denshi goes over a simple explanation of what computer kernels are and how they work, alongside what makes the Linux kernel special.
NoSQL
https://nosql-database.org/
Everything about NoSQL Systems – Types, Benefits, and Real-World Uses
https://piaosanlang.gitbooks.io/mongodb/content/section1.1.html
NoSQL (NoSQL = Not Only SQL) refers to non-relational databases. It is an umbrella term for database management systems that differ from traditional relational ones.
https://www.youtube.com/watch?v=0buKQHokLK8
NoSQL databases can operate in multiple modes: as key-value store, document store or wide column store.
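Related positions
Internship D8030
1. Take on the design and development of the company's unified underlying distributed storage platform, a fully self-developed system similar to Alibaba's Pangu;
2. Deliver EB-scale, strongly consistent, ultra-high-performance, ultra-low-latency, highly available, highly reliable, full-featured, operations-friendly, industry-leading storage services (distributed file storage, block storage, OSS storage, distributed log, and others) to all of Kuaishou's products and businesses, the container cloud, CDN, and other technical teams;
3. For Kuaishou's ever-growing business domains such as images, short/long video, compression and encryption, algorithmic processing, and security, design, develop, and optimize solutions to storage-side pain points, tune storage to business needs and characteristics, and improve storage performance, cost, and stability overall;
4. Continuously drive the optimization and evolution of the company's storage technology stack, bring new storage trends and technologies into production at Kuaishou, and keep upgrading and evolving the architecture.
Updated 2025-06-09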