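Xiaohongshu: Distributed Vector Database R&D Engineer/Expert
Experienced hire | Full-time | 3+ years | Backend development | Location: Shanghai, Beijing, Hangzhou | Status: Hiring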
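Requirements
Basic requirements
1. At least 3 years of database systems R&D experience, with solid experience in the architecture design, development, and production support of distributed databases.
2. Proficiency in one or more of C/C++/Rust/Golang/Java; familiarity with common algorithms and data structures; a solid grounding in operating systems, storage, and databases.
3. Strong learning ability and self-motivation, strong curiosity about technology, and good problem analysis and problem-solving skills.
Preferred qualifications
1. Experience participating in open-source communities; core contributors to well-known open-source projects are preferred.
2. Familiarity with, or deep involvement in, any of the following open-source projects is preferred: Milvus/Qdrant/RocksDB/HBase/Cassandra/Redis/MongoDB, among others.
3. Experience with production support, performance tuning, and stability assurance for large-scale distributed systems is preferred.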
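Responsibilities
1. Take part in R&D for the company's vector database: design and build a next-generation distributed vector database system that supports core business scenarios such as AI, social, search, recommendation, advertising, and e-commerce.
2. Own the full development lifecycle of the product, including kernel design, development and testing, performance tuning, management and control, and documentation; evolve the system to meet business growth needs and deliver a highly available, highly reliable, and cost-effective vector service.
3. Learn from and adopt leading industry technology and research results, actively explore and extend new product capabilities, and continuously improve the product's technical competitiveness and service quality.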
Includes English-language materials
System design
https://roadmap.sh/system-design
Everything you need to know about designing large scale systems.
https://www.youtube.com/watch?v=F2FmTdLtb_4
This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture with clear explanations, real-world examples, and practical strategies.
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Rust
https://www.youtube.com/watch?v=BpPEoZW5IiY
In this comprehensive Rust course for beginners, you will learn about the core concepts of the language and underlying mechanisms in theory.
https://www.youtube.com/watch?v=lzKeecy4OmQ
Full Rust 101 Crash Course for beginners.
https://www.youtube.com/watch?v=rQ_J9WH6CGk
Go
https://www.youtube.com/watch?v=8uiZC0l4Ajw
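A complete tutorial for learning Golang! Under an hour from start to finish, including a full demonstration of how to build an API in Go. No filler, only what you need to know.
Java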
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
A full course tutorial on data structures and algorithms in Java.
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model that is similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with the HBase shell.
Cassandra
[English] Learn Cassandra
https://teddyma.gitbooks.io/learncassandra/content/index.html
This book step-by-step guides developers to understand what Cassandra is, how Cassandra works and how to use the features and capabilities of Apache Cassandra 2.0.
https://www.freecodecamp.org/news/the-apache-cassandra-beginner-tutorial/
In this tutorial I will introduce you to Apache Cassandra, a distributed, horizontally scalable, open-source database.
https://www.youtube.com/watch?v=J-cSy5MeMOA
Apache Cassandra is an open source NoSQL distributed database.
Redis
[English] Developer Hub
https://redis.io/dev/
Get all the tutorials, learning paths, and more you need to start building—fast.
https://www.runoob.com/redis/redis-tutorial.html
REmote DIctionary Server (Redis) is a key-value storage system written by Salvatore Sanfilippo; it is a cross-platform, non-relational database.
https://www.youtube.com/watch?v=jgpVdJB2sKQ
In this video I will be covering Redis in depth from how to install it, what commands you can use, all the way to how to use it in a real world project.
MongoDB
https://learnxinyminutes.com/mongodb/
MongoDB is a NoSQL document database for high volume data storage.
https://studio3t.com/academy/#courses
The fastest way to learn MongoDB
https://www.youtube.com/watch?v=c2M-rlkkT5o
This video will give you an introduction to MongoDB in 1 hour. Afterwards I recommend exploring aggregation, replication, and sharding.
https://www.youtube.com/watch?v=ExcRbA7fy_A&list=PL4cUxeGkcC9h77dJ-QJlwGlZlTd4ecZOA
You'll learn how to use MongoDB (a NoSQL database) from scratch. You'll also learn how to integrate it into a simple Node.js API.
Distributed systems
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
Performance tuning
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.
Milvus
[English] Tutorials Overview
https://milvus.io/docs/tutorials-overview.md
This page provides a list of tutorials for you to interact with Milvus.
https://www.baeldung.com/milvus-tutorial-intro
In this tutorial, we’ll explore Milvus, a highly scalable open-source vector database.
https://www.youtube.com/watch?v=7ejr_ZzU9jw
Discover the power of Milvus, an open-source vector database revolutionizing AI applications.
https://www.youtube.com/watch?v=Yhv19le0sBw
Vector databases have been trending recently as they power modern search, recommendations, and AI-driven applications.
RocksDB
https://rocksdb.org/docs/getting-started.html
The RocksDB library provides a persistent key value store.
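Related positions
Experienced hire | A92695A
Team introduction: The Infrastructure Database CDI (Common Data Infra) team supports the data infrastructure behind ByteDance's core business lines and is deeply involved in the evolution of business storage and data architecture. The team is responsible for the R&D and iteration of products such as FxDB, a distributed database for the Base domain, and a vector database, working on cutting-edge database technology to help the business improve its core technical quality.
1. Implement and optimize the performance of core algorithms such as vector index construction and vector retrieval.
2. Own the architecture design, feature iteration, and productization of a high-performance vector database.
3. Propose and deliver solutions for the specific needs of general AI product scenarios.
Updated 2024-07-30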
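Experienced hire | A232774B
Team introduction: The Infrastructure Database CDI (Common Data Infra) team supports the data infrastructure behind ByteDance's core business lines and is deeply involved in the evolution of business storage and data architecture. The team is responsible for the R&D and iteration of products such as FxDB, a distributed database for the Base domain, and a vector database, working on cutting-edge database technology to help the business improve its core technical quality.
1. Implement and optimize the performance of core algorithms such as vector index construction and vector retrieval.
2. Own the architecture design, feature iteration, and productization of a high-performance vector database.
3. Propose and deliver solutions for the specific needs of general AI product scenarios.
Updated 2024-07-30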
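Experienced hire | 2+ years | A162987
1. Design and develop storage components for large-model inference and training scenarios, including model distribution and loading, KV Cache storage and optimization, and data I/O performance optimization, to improve core performance metrics.
2. Design and implement a distributed caching file system for model training on massive datasets and for video transcoding, using memory, SSD, HDD, and cloud object storage to persist and manage data while balancing storage performance against cost.
3. Design and implement a metadata management service for multimodal content understanding and retrieval.
Updated 2025-05-09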