
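WeRide (文远知行) Senior Big Data Platform Development Engineer
Experienced hire · Full-time · 5+ years · Location: Guangzhou · Status: Hiring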
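Job Requirements
Customize, design, and develop system modules for K8s clusters in the cloud and in our own data centers, for distributed storage clusters, and for the big data and deep learning platforms running on them. This includes, but is not limited to, customizing and developing monitoring systems for the big data and deep learning platforms, task tracing and performance analysis systems, low-level plugins for K8s clusters, and CI/CD systems for platform services. The role requires thorough, detail-oriented work, a strong interest in big data and deep learning technologies and platforms, strong team management and communication skills, an agile mind, and a strong ability to learn ahead of the curve.
Responsibilities:
1. Customize and develop monitoring and alerting systems, task tracing and performance analysis systems, CI/CD systems for platform services, and similar systems, both in the cloud and in our own data centers.
2. Customize and develop low-level plugins for K8s clusters, including CNI, GPU, and RDMA.
3. Develop the foundational system modules of the big data and deep learning platforms.
Requirements:
1. 5+ years of experience in big data system development, design, and architecture. Experience developing big data or deep learning jobs is a plus.
2. 3+ years of experience developing big data platform infrastructure on AWS or Alibaba Cloud.
3. Familiarity with big data technologies (Hadoop, Kafka, Hive, Zookeeper, Spark, Cassandra, MapReduce), including having read the relevant source code.
4. Proficient understanding and command of Unix or Linux systems.
5. Experience with big data cloud platforms, compute and storage platforms, and visual development platforms; successful medium-to-large project experience preferred; experience using large-scale distributed computing platforms and developing parallel algorithms.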
Job Responsibilities
None
Includes English-language materials
Kubernetes+
https://kubernetes.io/docs/tutorials/kubernetes-basics/
This tutorial provides a walkthrough of the basics of the Kubernetes cluster orchestration system.
https://kubernetes.io/zh-cn/docs/tutorials/kubernetes-basics/
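This tutorial covers the basics of the Kubernetes cluster orchestration system. Each module contains some background information on major Kubernetes features and concepts, and includes an online tutorial for you to work through.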
https://www.youtube.com/watch?v=s_o8dwzRlu4
Hands-On Kubernetes Tutorial | Learn Kubernetes in 1 Hour - Kubernetes Course for Beginners
https://www.youtube.com/watch?v=X48VuDVv0do
Full Kubernetes Tutorial | Kubernetes Course | Hands-on course with a lot of demos
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Deep Learning+
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
CI+
https://www.ibm.com/cn-zh/think/topics/continuous-integration
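Continuous integration (CI) is a software development practice in which developers regularly integrate new code and code changes into a central code repository throughout the development cycle. It is a key component of DevOps and agile methodologies.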
https://www.youtube.com/watch?v=42UP1fxi2SY
CD+
https://www.redhat.com/zh-cn/topics/devops/what-is-ci-cd
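CI/CD stands for continuous integration and continuous delivery/deployment; it aims to streamline and accelerate the software development lifecycle.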
https://www.youtube.com/watch?v=R8_veQiYBjI&list=PLy7NrYWoggjzSIlwxeBbcgfAdYoxCIrM2
AWS+
https://aws.amazon.com/
Amazon Web Services offers reliable, scalable, and inexpensive cloud computing services. Free to join, pay only for what you use.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
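Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single computer to several thousand.
[English] Hadoop Tutorial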
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
ZooKeeper+
https://kubernetes.io/docs/tutorials/stateful-application/zookeeper/
This tutorial demonstrates running Apache Zookeeper on Kubernetes using StatefulSets, PodDisruptionBudgets, and PodAntiAffinity.
https://www.baeldung.com/java-zookeeper
Apache ZooKeeper is a distributed coordination service which eases the development of distributed applications.
[English] Zookeeper Tutorial
https://www.tutorialspoint.com/zookeeper/index.htm
ZooKeeper is a distributed coordination service to manage a large set of hosts.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Cassandra+
[English] Learn Cassandra
https://teddyma.gitbooks.io/learncassandra/content/index.html
This book step-by-step guides developers to understand what Cassandra is, how Cassandra works and how to use the features and capabilities of Apache Cassandra 2.0.
https://www.freecodecamp.org/news/the-apache-cassandra-beginner-tutorial/
In this tutorial I will introduce you to Apache Cassandra, a distributed, horizontally scalable, open-source database.
https://www.youtube.com/watch?v=J-cSy5MeMOA
Apache Cassandra is an open source NoSQL distributed database.
MapReduce+
https://www.youtube.com/watch?v=bcjSe0xCHbE
https://www.youtube.com/watch?v=cHGaQz0E7AU
In this video I explain the basics of Map Reduce model, an important concept for any software engineer to be aware of.
Unix+
[English] The UNIX® Standard
https://www.opengroup.org/membership/forums/platform/unix
https://www.youtube.com/watch?v=IrDUcdpPmdI
UNIX is an operating system which was first developed in the 1970s, and has been under constant development ever since.
Linux+
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Algorithms+
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Related Positions
Experienced hire · Technical team · Development
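1. Big data visualization configuration platform development: Design and develop the big data visualization configuration platform and independently implement its core functional modules. Optimize SQL queries for big data scenarios to improve data query and processing performance. Design and implement high-performance, highly available Java backend services that keep the visualization platform running stably.
2. Technical research and innovation: Track the latest trends in big data and visualization technology and continuously improve platform features. Research and introduce new technologies to improve system performance, scalability, and user experience. Solve technical problems and drive the application of technical innovation in real projects.
3. Team collaboration and mentoring: Work closely with product managers, front-end developers, data analysts, and other team members to keep projects moving efficiently. Take part in discussing and designing technical solutions, propose feasible recommendations, and drive them to completion. Share technical experience to help team members grow together.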
Updated 2025-03-11