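Ant Financial: Ant Group - Lakehouse Compute Engine R&D Expert / Senior Expert - Hangzhou/Beijing/Shanghai
Experienced hire; Full-time; 5+ years of experience; Technology - Development; Location: Beijing | Shanghai | Hangzhou; Status: Hiring
Includes English-language materials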
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
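Hadoop provides reliable, scalable application-layer computing and storage for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to several thousand machines.
[English] Hadoop Tutorial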
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
ClickHouse
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Doris
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
Celeborn
https://celeborn.apache.org/docs/latest/
This documentation gives a quick start guide for running Spark/Flink/MapReduce with Apache Celeborn™.
Velox
[English] Velox in 10 minutes
https://facebookincubator.github.io/velox/velox-in-10-min.html
This is a quick introduction to Velox, the new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
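Related positions
Experienced hire K4338
The Data Engine - Data Lake team aims to build an industry-leading, EB-scale data lake that supports many of ByteDance's core business lines, such as Douyin, Toutiao, and e-commerce. Building on internal best practices, the team is also creating a cloud-native, real-time lakehouse toB product on Volcano Engine: LAS (LakeHouse Analytics Service).
1. Build an industry-leading, EB-scale data lake based on HUDI that supports many of ByteDance's business lines (such as Douyin, Toutiao, and e-commerce);
2. Design and develop a real-time data lake storage system with unified stream and batch processing, and optimize its kernel as far as possible;
3. Work closely with the open-source community and keep building open-source influence, with the opportunity to grow into a HUDI Committer / PMC member.
Updated 2022-08-17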

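Experienced hire; 2+ years of experience; Technology
1. [Engine R&D] Develop the kernel of big data query engines based on Spark, Presto, and Hive: track community releases, improve performance and stability, build new features, and fix kernel bugs;
2. [Business support] Investigate, locate, and resolve production cluster issues; work with the operations team to keep production clusters stable; and help business teams make good use of the big data platform;
3. [Platform planning] Take part in planning the technical evolution of the company's compute platform, strengthen its lakehouse capabilities, and build a highly stable, high-performance, low-cost compute platform on cloud IaaS or self-built IaaS.
Updated 2023-12-26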
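Experienced hire; 5+ years of experience; Cloud Intelligence Group
1. Own core optimization of the SQL engine: understand in depth how online workloads use SQL, follow common industry benchmarks, analyze performance bottlenecks, and make targeted improvements.
2. Own the evolution of the SQL engine's incremental computing capability and build an industry-leading incremental computing product.
3. Own the development and capability building of the MC intelligent data warehouse, improving MC's overall cost-effectiveness and ease of use.
4. Support online workloads on the SQL engine, including answering difficult questions, improving online stability, and improving system observability and the user experience.
Updated 2025-09-15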