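Alibaba Cloud: Alibaba Cloud Intelligence - Big Data R&D Expert - SQL Engine
Experienced hire, full-time, 3+ years of experience, Cloud Intelligence Group. Location: Beijing. Status: Hiring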
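Requirements
1. Solid Java/C++ backend development fundamentals (familiar with basic frameworks such as I/O and multithreading, and with mechanisms such as distributed systems and caching);
2. Experience developing SQL engines (basic understanding of the SQL Compiler, Optimizer, and Runtime modules; knowledge of techniques such as Codegen, Vectorize, and CBO; hands-on development experience with Hive, Spark, or an OLAP engine preferred);
3. Strong object-oriented programming experience, a deep understanding of Object-Oriented thinking, strong system analysis and design skills, and familiarity with common design patterns;
4. Passionate about technology; conscientious and rigorous at work; strong learning ability and sense of responsibility; self-motivated; good at communication and teamwork; able to adapt quickly to change;
5. Background in cloud big data product development (lakehouse, unified offline/online processing, etc.) preferred.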
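Responsibilities
1. Own performance optimization and stability improvements for MCQA2.0, making MaxCompute more competitive at small and medium data scales.
2. Own core SQL engine optimization: develop a deep understanding of how production workloads use SQL, track common industry benchmarks, analyze performance bottlenecks, and make targeted improvements.
3. Own the architectural evolution and continuous optimization of the SQL engine control path (end-to-end caching, asynchronous refactoring, etc.) to improve performance and stability.
4. Explore adapting the SQL engine to new hardware (GPU, FPGA, etc.) to give MaxCompute more cost-effective hardware resources.
5. Support the SQL engine's production workloads, including answering difficult technical questions, improving production stability, and improving system observability and user experience.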
Includes English-language materials
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Backend development
https://www.youtube.com/watch?v=tN6oJu2DqCM&list=PLWKjhJtqVAbn21gs5UnLhCQ82f923WCgM
Learn what technologies you should learn first to become a back end web developer.
Multithreading
https://liaoxuefeng.com/books/java/threading/basic/index.html
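Compared with single-threaded programming, multithreaded programming is distinctive in that threads frequently read and write shared data and therefore need synchronization.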
https://www.youtube.com/watch?v=_uQgGS_VIXM&list=PLsc-VaxfZl4do3Etp_xQ0aQBoC-x5BIgJ
https://www.youtube.com/watch?v=IEEhzQoKtQU
https://www.youtube.com/watch?v=mTGdtC9f4EU&list=PLL8woMHwr36EDxjUoCzboZjedsnhLP1j4
https://www.youtube.com/watch?v=TPVH_coGAQs&list=PLk6CEY9XxSIAeK-EAh3hB4fgNvYkYmghp
https://www.youtube.com/watch?v=xPqnoB2hjjA
This video is an introduction to multithreading in modern C++.
https://www.youtube.com/watch?v=YKBwKy5PrpQ
Rust threading is easy to implement and improves the efficiency of your applications on multi-core systems!
Caching
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
OLAP
https://www.youtube.com/watch?v=iw-5kFzIdgY
OLAP (for online analytical processing) is software for performing multidimensional analysis at high speeds on large volumes of data from a data warehouse, data mart, or some other unified, centralized data store.
Design patterns
https://liaoxuefeng.com/books/java/design-patterns/index.html
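Design patterns are coding practices that are reused again and again in software design. The goal of using them is to make code reusable and to improve its extensibility and maintainability.
[English] Design Patterns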
https://refactoring.guru/design-patterns
Design patterns are typical solutions to common problems in software design. Each pattern is like a blueprint that you can customize to solve a particular design problem in your code.
https://www.youtube.com/watch?v=NU_1StN5Tkk
Design Patterns tutorial explained in simple words using real-world examples.
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
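Related positions
Experienced hire, 5+ years of experience, Cloud Intelligence Group
1. Own core SQL engine optimization: develop a deep understanding of how production workloads use SQL, track common industry benchmarks, analyze performance bottlenecks, and make targeted improvements.
2. Own the evolution of the SQL engine's incremental computation capabilities and build an industry-leading incremental computation product.
3. Own development and capability building for the MC intelligent data warehouse, improving MC's overall cost-effectiveness and ease of use.
4. Support the SQL engine's production workloads, including answering difficult technical questions, improving production stability, and improving system observability and user experience.
Updated 2025-09-15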

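Experienced hire, 3+ years of experience, Data Department
1. Scope includes real-time data collection and computation, streaming lakehouse construction, architecture tuning of existing platforms, business scenario modeling, and improving the quality and efficiency of data output.
2. Application scenarios: BI, OLAP, user profiling, cohort analysis, data-driven decision making, etc.
3. Solutions: building data collection, data flow, and real-time ETL platforms on Kafka and Flink; modeling, applying, and maintaining a streaming lakehouse built on Kyuubi, Trino, Doris, Hive, Hudi, Ceph, and S3.
4. Applying data storage technologies such as Redis, MongoDB, ClickHouse, TiDB, and MySQL.
Technical skills:
1. SQL expert: proficient in complex queries, performance optimization, and the SQL features of different engines (required);
2. Proficient in data warehouse and data lake architecture design and business modeling (required);
3. Familiar with multiple storage and compute engines such as ClickHouse, Doris, Trino, Kyuubi, Spark, Hive, and S3 (required); proficient in Python (required); proficient in the Kafka and Flink stream computing platforms; proficient in Java/Scala.
Updated 2025-10-16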
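Experienced hire, D7195
1. Take part in developing and optimizing the compute engine systems of Kuaishou's EB-scale big data platform, addressing real business needs and performance problems;
2. Take on the design and implementation complexity of big data platform systems, analyze and identify optimization opportunities, and drive improvements in the systems' soundness, reliability, and availability;
3. Stay in contact with the open-source community, bringing in features and systems that help the company's business scenarios, or contributing internally developed features back to the community.
Updated 2025-03-07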