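ByteDance: Platform Architecture R&D Technical Expert / Architect, Volcano Engine
Experienced hire | Full-time | A37428A | Location: Shanghai | Status: Hiring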
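Requirements
1. Bachelor's degree or above in computer science or a related field; proficient in one or two of Golang, Java, or Python.
2. Familiar with server-side technologies such as RPC frameworks, message queues, caching, thread pools, and data sharding; understands the technical challenges of microservice architectures and has solutions for them.
3. Solid technical foundation; familiar with development and design approaches for performance, availability, scalability, extensibility, and security, and with the architectures commonly used in the industry.
4. Passionate about programming, with strong learning and abstraction skills, a strong desire for knowledge, curiosity, and drive; follows and learns from the industry in a timely manner.
5. Preference given to candidates who understand cloud computing or have experience in cloud platform control-plane development, DevOps, or stability system architecture R&D.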
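Responsibilities
1. Develop Volcano Engine's platform architecture engineering systems, including requirements analysis, system design, coding, and testing.
2. Develop the common components and products that Volcano Engine cloud services depend on, keep the cloud services running efficiently, and continuously iterate on and upgrade the technology.
3. Build Volcano Engine's stability platform, including the design and development of platform features for monitoring, alerting, fault diagnosis, and recovery.
4. Take part in Volcano Engine technical design discussions and decisions, and drive the continuous optimization and improvement of the cloud service architecture.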
Includes English-language materials
Education
Go
https://www.youtube.com/watch?v=8uiZC0l4Ajw
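A complete tutorial for learning Golang! Start to finish in under an hour, including a full demonstration of how to build an API in Go. No filler, only what you need to know.
Java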
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
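RPC
https://javaguide.cn/distributed-system/rpc/rpc-intro.html
Why RPC? Methods provided by services on two different servers do not share a memory space, so the arguments a method call needs must be passed through network programming, and the result of the call must likewise be received over the network.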
https://www.youtube.com/watch?v=S2osKiqQG9s
This video is part of an 8-lecture series on distributed systems, given as part of the undergraduate computer science course at the University of Cambridge.
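The point made above about RPC (arguments and results crossing the network because the caller and the method do not share memory) can be illustrated with a minimal sketch. It uses Python's standard-library `xmlrpc` modules only as a convenient stand-in; the function `add` and the helpers `start_server` and `call_add` are hypothetical names chosen for this example, and nothing here reflects the RPC framework the team actually uses.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Hypothetical remote method: it runs in the server, not in the caller."""
    return a + b


def start_server():
    # Port 0 asks the OS for any free port; logRequests=False keeps stderr quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    # Serve in a background thread so the same script can also act as the client.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_add(port, a, b):
    # The proxy serializes the arguments, sends them over HTTP, and
    # deserializes the reply, so the remote call reads like a local one.
    with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
        return proxy.add(a, b)


if __name__ == "__main__":
    server = start_server()
    port = server.server_address[1]
    print(call_add(port, 2, 3))
    server.shutdown()
    server.server_close()
```

Production RPC frameworks add service discovery, load balancing, timeouts, and more compact serialization on top of this basic request/response shape.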
Message queues
https://www.youtube.com/watch?v=xErwDaOc-Gs
Caching
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
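As a small illustration of the Least Recently Used (LRU) eviction policy mentioned in the caching resources above, here is a minimal in-process sketch built on `collections.OrderedDict`. The class name `LRUCache` and its `get`/`put` methods are assumptions made for this example; a distributed cache would add networking, expiry, and concurrency control that are not shown.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # ordered from oldest to most recent use

    def get(self, key, default=None):
        if key not in self._items:
            return default
        # A read counts as a use, so move the key to the "most recent" end.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Drop the entry at the "oldest" end.
            self._items.popitem(last=False)


if __name__ == "__main__":
    cache = LRUCache(2)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")         # "a" is now the most recently used key
    cache.put("c", 3)      # over capacity, so "b" is evicted
    print(cache.get("b"))  # None
```

Both operations run in constant time on average, which is the usual reason LRU is implemented with a hash map plus an ordered structure.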
Microservices
https://learn.microsoft.com/en-us/training/modules/dotnet-microservices/
Microservice applications are composed of small, independently versioned, and scalable customer-focused services that communicate with each other by using standard protocols and well-defined interfaces.
https://microservices.io/
Microservices - also known as the microservice architecture - is an architectural style that structures an application as a collection of two or more services.
https://spring.io/microservices
Building small, self-contained, ready to run applications can bring great flexibility and added resilience to your code.
https://www.ibm.com/think/topics/microservices
Microservices, or microservices architecture, is a cloud-native architectural approach in which a single application is composed of many loosely coupled and independently deployable smaller components or services.
https://www.youtube.com/watch?v=CqCDOosvZIk
https://www.youtube.com/watch?v=hmkF77F9TLw
Learn about software system design and microservices.
DevOps
https://roadmap.sh/devops
Step by step guide for DevOps, SRE or any other Operations Role in 2025
https://zhuanlan.zhihu.com/p/562036793
In DevOps, Dev stands for Development and Ops stands for Operations. In one sentence, DevOps breaks down the wall between development and operations and integrates the two.
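Related positions
Experienced hire U1064
1. Develop Volcano Engine's platform architecture engineering systems, including requirements analysis, system design, coding, and testing.
2. Develop the common components and products that Volcano Engine cloud services depend on, keep the cloud services running efficiently, and continuously iterate on and upgrade the technology.
3. Build Volcano Engine's stability platform, including the design and development of platform features for monitoring, alerting, fault diagnosis, and recovery.
4. Take part in Volcano Engine technical design discussions and decisions, and drive the continuous optimization and improvement of the cloud service architecture.
Updated 2022-10-18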
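Experienced hire F9175
1. Design and develop business and architecture solutions for AIOps scenarios in Volcano Engine platform stability (intelligent monitoring, change risk identification and detection, incident/problem root cause localization, alert aggregation, architecture governance, cost optimization, and so on), and build SRE Agent capabilities.
2. Develop Volcano Engine's platform architecture engineering systems, including requirements analysis, system design, coding, and testing.
3. Build Volcano Engine's stability platform, including the design and development of platform features for monitoring, alerting, fault diagnosis, and recovery.
Updated 2022-10-19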
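Experienced hire A81609
1. Ensure the stability of Volcano Engine's cloud-native container platform products, continuously improving the stability of cloud product systems through platform building, architecture optimization, and organizational improvement.
2. Ensure the stability of the container platform and large-scale container clusters, carrying out reliability analysis and optimization; analyze business architecture and system runtime behavior in depth, continuously identify stability weak points, tackle hard technical problems, and improve the overall stability of the system's core paths.
3. Take part in planning and building the operations control platform for Volcano Engine's cloud-native container platform products; design and implement automated operations, analysis and diagnosis, and assurance systems, building automated operations and control for multi-region, ultra-large-scale clusters.
Updated 2025-06-10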