
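Hello (哈啰) Big Data Development Engineer - Data Platform - Shanghai
Experienced hire · Full-time · Technology · Location: Shanghai · Status: Hiring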
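Requirements
1. Solid Java fundamentals and some understanding of JVM internals; able to troubleshoot and resolve problems independently.
2. Proficient in multithreaded, high-performance design and coding and in performance tuning; experience developing high-concurrency applications.
3. Familiar with the design and application of distributed systems, and with common open-source frameworks for distributed services, caching, messaging, Spring, iBATIS, and the like.
4. Familiar with common Linux commands and with scripting languages such as Python, shell, and JavaScript; experience with SQL optimization.
5. Innovative thinking, strong learning ability, able to work under pressure, good at communication and teamwork, and willing to share.
6. Experience developing applications on distributed data storage and computing platforms; familiarity with Hadoop ecosystem technologies and hands-on experience with them is preferred.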
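Responsibilities
1. Ensure the stability of the offline and real-time clusters and tune their performance.
2. Research advanced storage engines used in the industry and bring them into the existing platform, covering setup, development, and operations.
3. Plan the component makeup and technical direction of big data storage, and handle routine technology selection.
4. Carry out secondary development and bug fixes on open-source projects to meet business needs.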
Includes English-language materials
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
JVM+
https://www.freecodecamp.org/news/jvm-tutorial-java-virtual-machine-architecture-explained-for-beginners/
https://www.youtube.com/watch?v=e2zmmkc5xI0
Multithreading+
https://liaoxuefeng.com/books/java/threading/basic/index.html
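Compared with single-threaded programming, the defining trait of multithreaded programming is that threads frequently read and write shared data and therefore need to synchronize.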
https://www.youtube.com/watch?v=_uQgGS_VIXM&list=PLsc-VaxfZl4do3Etp_xQ0aQBoC-x5BIgJ
https://www.youtube.com/watch?v=IEEhzQoKtQU
https://www.youtube.com/watch?v=mTGdtC9f4EU&list=PLL8woMHwr36EDxjUoCzboZjedsnhLP1j4
https://www.youtube.com/watch?v=TPVH_coGAQs&list=PLk6CEY9XxSIAeK-EAh3hB4fgNvYkYmghp
https://www.youtube.com/watch?v=xPqnoB2hjjA
This video is an introduction to multithreading in modern C++.
https://www.youtube.com/watch?v=YKBwKy5PrpQ
Rust threading is easy to implement and improves the efficiency of your applications on multi-core systems!
Performance tuning+
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.
High concurrency+
https://www.baeldung.com/concurrency-principles-patterns
In this tutorial, we’ll discuss some of the design principles and patterns that have been established over time to build highly concurrent applications.
https://www.baeldung.com/java-concurrency
Handling concurrency in an application can be a tricky process with many potential pitfalls. A solid grasp of the fundamentals will go a long way to help minimize these issues.
https://www.oreilly.com/library/view/concurrency-in-go/9781491941294/
You’ll understand how Go chooses to model concurrency, what issues arise from this model, and how you can compose primitives within this model to solve problems.
https://www.oreilly.com/library/view/modern-concurrency-in/9781098165406/
With this book, you'll explore the transformative world of Java 21's key feature: virtual threads.
https://www.youtube.com/watch?v=qyM8Pi1KiiM
https://www.youtube.com/watch?v=wEsPL50Uiyo
Distributed systems+
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
Caching+
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
Spring+
https://liaoxuefeng.com/books/java/spring/index.html
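Spring is a framework that supports rapid development of Java EE applications. It provides a set of low-level containers and infrastructure and integrates seamlessly with many popular open-source frameworks, making it all but essential for Java EE development.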
https://spring.io/guides/gs/rest-service
https://spring.io/quickstart
Level up your Java code and explore what Spring can do for you.
iBATIS+
[English] iBATIS Tutorial
https://www.tutorialspoint.com/ibatis/index.htm
Linux+
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with full examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Bash+
[English] The Bash Guide
https://guide.bash.academy/
A quality-driven guide through the shell's many features.
https://www.youtube.com/watch?v=tK9Oc6AEnR4
Understanding how to use bash scripting will enhance your productivity by automating tasks, streamlining processes, and making your workflow more efficient.
Scripting+
[English] Scripting language
https://en.wikipedia.org/wiki/Scripting_language
https://zhuanlan.zhihu.com/p/571097954
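A script is usually interpreted rather than compiled. Scripting languages are generally simple, easy to learn, and easy to use, with the goal of letting programmers get programs written quickly.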
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
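Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large datasets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial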
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows big data to be stored and processed in a distributed environment across clusters of computers using simple programming models.
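Related positions
Experienced hire
1. Take full responsibility for requirements gathering, platform architecture design, and development of data production tools for AI algorithms, covering data types including but not limited to audio and text.
2. Build the data warehouse for AI data assets, including tag taxonomy design, data security policies, and data query and retrieval, and integrate it efficiently and reliably with the automated model training platform.
3. Operate and maintain the data cloud service systems.
4. Work with the department's algorithm engineers and data production engineers to understand data platform requirements and deliver project upgrades.
5. Handle team-building work such as writing documentation and training new developers.
Updated 2025-02-08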
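Experienced hire · 2+ years · A259603A
1. Own quality assurance for ByteDance's international advertising data platform and data products.
2. Develop tools that improve testing efficiency in this work.
3. Follow up on production data issues: trace them, analyze them, and drive improvements.
4. Bring strong product sense to projects, proposing and implementing good product solutions.
Updated 2025-06-18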
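Experienced hire · 2+ years · Technology - Quality Assurance
We are looking for a professional who is passionate about data algorithm quality to join our team and help raise the overall quality of our data algorithms. Your main responsibilities will be:
1) Quality assurance and test design: design comprehensive quality assurance and testing strategies based on the overall architecture and business requirements of the data algorithms, and continuously improve our data algorithm quality system.
2) Documentation and knowledge capture: write test analysis documents and ensure quality at every stage, including data collection, ETL, metric processing, data backflow, algorithm models, and product applications.
3) Technical innovation and problem solving: help solve complex technical problems that arise in testing, and drive innovation in quality tooling, testing techniques, and the development and testing process to improve testing efficiency.
4) Daily monitoring and incident response: take part in daily monitoring and operations support for production data algorithm jobs, detect and handle production risks promptly, and eliminate or reduce their impact on the business.
5) Risk governance and capability building: analyze and identify business risks, build up risk governance capabilities, drive risk governance measures into practice, and strengthen overall quality assurance.
6) Quality and risk awareness: cultivate team members' awareness of quality and risk, provide regular quality training and guidance, and raise the team's overall quality level.
Updated 2025-09-13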