希音资深后台开发(大数据组件开发)
社招全职6年以上信息技术类地点:深圳状态:招聘
任职要求
1.至少6年以上相关经验,有扎实的计算机编程基础,精通java/scala,熟悉jvm的原理和调优。 2.精通spark/hive/flink组件原理和内核优化,有超大规模数据计算的架构设计和优化经验。 3.掌握大数据行业趋势,熟悉Kubernetes/Docker,有组件容器化相关经验。 4.具备较强的问题解决能力,能独立分析和攻坚复杂的技术难题。 5.有公有云使用经验者优先。 6.有良好的服务意识、沟通能力和团队协作精神。 ——Kafka方向: 岗位职责: 1、设计和部署高可用、可扩展的Kafka集群,满足业务需求和性能目标; 2、管理和优化Kafka集群的性能和稳定性,日常监控Kafka集群的健康状况,确保数据流畅处理,并及时响应和处理线上问题; 3、建设kafka运维和开发自动化平台,开发和维护Kafka sdk,支持多种编程语言; 4、制定kafka使用规范,提供Kafka技术培训和支持给内部开发团队。 任职要求: 1、至少5年以上生产大规模使用Kafka的经验,包括Kafka集群的设计、部署和运维; 2、具有在大数据环境下工作的经验,熟悉Hadoop、Flink等生态系统者优先; 3、扎实的编程技能,包括Java、Scala或Python,深入了解分布式系统原理和容错机制; 4、优秀的问题解决能力,能够在快节奏的环境中独立工作和做出决策; 5、熟悉各大云服务(如AWS、Azure或GCP)、pulsar者优先; 6、出色的沟通技巧和团队合作精神。 城市:南京、深圳(可选)
工作职责
——Spark方向: 1.大数据新技术规划、调研、选型及推广落地。 2.负责大数据组件内核开发优化,推进组件容器化,进行组件二次开发与适配等工作。 3.日常负责大数据框架组件的性能优化,稳定性保障,异常监控及线上问题对接解决。 4.参与平台功能研发,提供业务系统化的解决方案。
包括英文材料
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala+
JVM+
https://www.freecodecamp.org/news/jvm-tutorial-java-virtual-machine-architecture-explained-for-beginners/
https://www.youtube.com/watch?v=e2zmmkc5xI0
Spark+
[英文] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
内核+
https://www.youtube.com/watch?v=C43VxGZ_ugU
I rummage around the Linux kernel source and try to understand what makes computers do what they do.
https://www.youtube.com/watch?v=HNIg3TXfdX8&list=PLrGN1Qi7t67V-9uXzj4VSQCffntfvn42v
Learn how to develop your very own kernel from scratch in this programming series!
https://www.youtube.com/watch?v=JDfo2Lc7iLU
Denshi goes over a simple explanation of what computer kernels are and how they work, alonside what makes the Linux kernel any special.
系统设计+
https://roadmap.sh/system-design
Everything you need to know about designing large scale systems.
https://www.youtube.com/watch?v=F2FmTdLtb_4
This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture with clear explanations, real-world examples, and practical strategies.
大数据+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Kubernetes+
https://kubernetes.io/docs/tutorials/kubernetes-basics/
This tutorial provides a walkthrough of the basics of the Kubernetes cluster orchestration system.
https://kubernetes.io/zh-cn/docs/tutorials/kubernetes-basics/
本教程介绍 Kubernetes 集群编排系统的基础知识。每个模块包含关于 Kubernetes 主要特性和概念的一些背景信息,还包括一个在线教程供你学习。
https://www.youtube.com/watch?v=s_o8dwzRlu4
Hands-On Kubernetes Tutorial | Learn Kubernetes in 1 Hour - Kubernetes Course for Beginners
https://www.youtube.com/watch?v=X48VuDVv0do
Full Kubernetes Tutorial | Kubernetes Course | Hands-on course with a lot of demos
Docker+
https://www.youtube.com/watch?v=GFgJkfScVNU
Master Docker in one course; learn about images and containers on Docker Hub, running multiple containers with Docker Compose, automating workflows with Docker Compose Watch, and much more. 🐳
https://www.youtube.com/watch?v=kTp5xUtcalw
Learn how to use Docker and Kubernetes in this complete hand-on course for beginners.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
高可用+
https://redis.io/blog/high-availability-architecture/
A high available architecture is when there are a number of different components, modules, or services that work together to maintain optimal performance, irrespective of peak-time loads.
https://www.ibm.com/think/topics/high-availability
High availability (HA) is a term that refers to a system’s ability to be accessible and reliable close to 100% of the time.
SDK+
https://www.ibm.com/think/topics/api-vs-sdk
Learn about software development kits (SDKs) and application programming interfaces (APIs) and how they improve both software development cycles and the end-user experience (UX).
https://www.redhat.com/zh-cn/topics/cloud-native-apps/what-is-SDK
软件开发套件(SDK)是通常由硬件平台、操作系统(OS)或编程语言的制造商提供的一套工具。
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop 为庞大的计算机集群提供可靠的、可伸缩的应用层计算和存储支持,它允许使用简单的编程模型跨计算机群集分布式处理大型数据集,并且支持在单台计算机到几千台计算机之间进行扩展。
[英文] Hadoop Tutorial
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
中文,免费,零起点,完整示例,基于最新的Python 3版本。
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
分布式系统+
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
AWS+
https://aws.amazon.com/
Amazon Web Services offers reliable, scalable, and inexpensive cloud computing services. Free to join, pay only for what you use.
Azure+
https://azure.microsoft.com/
Invent with purpose, realize cost savings, and make your organization more efficient with Microsoft Azure’s open and flexible cloud computing platform.
Pulsar+
https://pulsar.apache.org/docs/next/functions-develop-tutorial/
Write a function for word count.
https://www.baeldung.com/apache-pulsar
Apache Pulsar is a distributed open source Publication/Subscription based messaging system developed at Yahoo.
https://www.youtube.com/watch?v=TKs5T6N78Tc
Discover the seven key features of Apache Pulsar that make it perfect for providing a centralized messaging & data streaming service for an Enterprise.
相关职位
社招3年以上信息技术类
1.大数据基础平台、应用平台功能设计和开发。 2.负责大数据平台及组件的调研选型,部署,日常监控及问题解决。 3.参与海量数据处理方案设计,提供业务系统技术支撑。
更新于 2025-09-24
社招3年以上信息技术类
1.大数据基础平台、应用平台功能设计和开发。 2.负责大数据平台及组件的调研选型,部署,日常监控及问题解决。 3.参与海量数据处理方案设计,提供业务系统技术支撑。
更新于 2025-04-16
社招3年以上信息技术类
1.大数据基础平台、应用平台功能设计和开发。 2.负责大数据平台及组件的调研选型,部署,日常监控及问题解决。 3.参与海量数据处理方案设计,提供业务系统技术支撑。
更新于 2025-04-24