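Ant Financial (Ant Group) - Data R&D Expert - Hangzhou [Data Platform]
Experienced hire | Full-time | 3+ years | Technical - Data | Location: Hangzhou | Status: Hiring
Requirements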
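1. At least 3 years of work experience and a bachelor's degree or above in computer science or a related field; able to develop modules independently.
2. Proficient in business modeling, data warehouse modeling, and ETL design and development, with experience in data risk management and governance.
3. Familiarity with data warehouse knowledge and skills, such as Hadoop/Hive/Spark/Flink, is preferred.
4. Familiar with how SQL is executed and with CBO/HBO; able to draw on data warehouse design to quickly deliver solutions that are optimal in cost, timeliness, compliance, and architecture.
5. Experience managing complex cross-department data projects or technical domains.
6. Passionate about big data, with a steady temperament and good communication skills; self-driven, with a strong desire to learn and improve, a team-oriented attitude, a willingness to take on challenges, and the ability to grow under pressure.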
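Responsibilities
1) Be familiar with privacy and security laws and regulations, and design solutions for data risk management, so that Ant's business data flows securely, compliantly, and efficiently.
2) Build data assets for the risk domain, and use data to guide and implement risk management and governance work.
3) Proactively drive the continuous iteration and optimization of security and compliance technologies and product platforms, and lead the promotion, operation, and adoption of these capabilities on the business side.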
Includes English-language materials
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
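Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop provides reliable, scalable application-layer computing and storage for large computer clusters. It allows large data sets to be processed in a distributed way across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial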
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Big data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
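Related positions
Experienced hire
1. Plan and build the data system for core business domains, efficiently supporting the data needs of business scenarios through data products and data services.
2. Develop a deep understanding of the business; by analyzing business strategies and pain points, design and deliver systematic end-to-end data solutions.
3. Take charge of data asset development and of data quality and stability management, building a shared, interconnected data platform so that data standards are more consistent and data access is more efficient.
4. Explore Data for AI and AI data products, using large models to improve the efficiency of data applications across the full pipeline.
Updated 2025-07-24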
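Experienced hire | 3+ years | Technical - Development
1. Build the real-time capabilities of Dataphin, Ant's site-wide data R&D platform, including the general-purpose + intelligent (unified stream-batch, Codeless) R&D platform and the quality assurance platform, supporting intelligent business decisions and operations so that data delivers value quickly.
2. Build the infrastructure of Dataphin, Ant's site-wide data R&D platform, ensuring that all users across the site can carry out data production and development stably, efficiently, and securely.
3. Take charge of the application architecture design and system implementation of Ant Group's controlled processing platform. Through systematic, forward-looking capability building, ensure control safeguards from development time, gray-release observation before launch, observability and alerting during execution, and fast emergency recovery afterward, so that the data "three axes" practices (数据三板斧) and data SLAs are fully adopted across Ant's data domain.
Updated 2025-10-09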
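Experienced hire | 3+ years | Technical - Infrastructure Platform
1. Design and develop Ant's large-model application platform, creating interaction paradigms that incorporate large models and supporting the building of large-model applications for Ant's core scenarios.
2. Help upgrade the large-model application pipeline, optimizing key stages such as fine-tuning and inference to deliver solutions that are low-cost, high-performance, and feel intelligent.
3. Use large models to upgrade interaction solutions such as voice and digital humans, building Ant's intelligent interaction technology stack.
Updated 2025-09-26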