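Ant Financial | Ant Digital Technologies - Digital Technology Department - Data Development Engineer
Experienced hire | Full-time | 2+ years | Technology - Data | Location: Hangzhou | Status: Hiring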
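Requirements
1. 2+ years of work experience; bachelor's degree or above in computer science, mathematics, statistics, or a related field; hands-on data modeling experience and familiarity with the principles of distributed computing.
2. Familiar with business modeling, data warehouse modeling, and ETL design and development, with systematic experience in data quality and data governance. Understands the full data development lifecycle and has worked on one or more of: data collection, data cleansing, data modeling, and data serving.
3. Familiar with the Hadoop/Hive big data ecosystem, with a deep understanding of its underlying implementation; experience with very-large-scale data projects; experience processing data at million-level TPS is a strong plus.
4. Familiar with the real-time computing stack, including data collection and compute engines such as Flink/Storm/Spark, with a deep understanding of transactions, fault tolerance, and reliability in real-time computing.
5. Strong business understanding; able to proactively abstract general-purpose data services; good cross-team communication skills.
6. Working development skills in Java and Python; able to develop UDFs and handle routine code maintenance.
7. A background in commercializing technology (platform) products is preferred.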
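Responsibilities
1. Build data assets for Ant Digital Technologies' risk-control business, supporting risk-control data development for ToC and ToB scenarios.
2. Own the modeling of core business data pipelines and offline and real-time data development, supporting data architecture planning and implementation for the business line; own assurance and capability building for the stability, timeliness, and quality of the business line's data services; own the governance and protection of the business line's data assets and data resources.
3. Work closely with algorithm, product, operations, and backend teams to bring business requirements into production quickly.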
Includes English-language materials
Education+
Data Warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL+
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
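The extract, transform, load steps described above can be illustrated with a minimal sketch. It is not taken from either linked resource: the sample CSV strings, the `fact_orders` table, and all function names are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical sample inputs standing in for two separate source systems.
ORDERS_CSV = "order_id,user_name,amount\n1, Alice ,10.5\n2,bob,\n3,ALICE,4.5\n"
REFUNDS_CSV = "order_id,user_name,amount\n4,Bob,-2.0\n"


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, normalize names, cast types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["user_name"].strip().lower(), float(amount))
        )
    return cleaned


def load(conn, records):
    """Load: write cleaned records into a single target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, user_name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn):
    """Run all three steps for each source, then query the combined table."""
    for source in (ORDERS_CSV, REFUNDS_CSV):
        load(conn, transform(extract(source)))
    query = (
        "SELECT user_name, SUM(amount) FROM fact_orders "
        "GROUP BY user_name ORDER BY user_name"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(run_etl(sqlite3.connect(":memory:")))
```

Keeping each stage as its own function means the cleaning rules can be tested without touching the source or the target, which is one common way to structure small ETL jobs.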
Data Governance+
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
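Hadoop provides reliable, scalable application-layer computing and storage for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to thousands of machines.
[English] Hadoop Tutorial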
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Apache Storm+
[English] Tutorial
https://storm.apache.org/releases/2.6.0/Tutorial.html
In this tutorial, you'll learn how to create Storm topologies and deploy them to a Storm cluster.
https://www.baeldung.com/apache-storm
This tutorial will be an introduction to Apache Storm, a distributed real-time computation system.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
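Related Positions
Experienced hire | 3+ years | Technology - Development
1. Responsible for the system design and development of Ant Digital Technologies' data product platforms (data architecture, data assets, and related areas), covering feasibility assessment, system functionality, stability, and user experience.
2. Plan and develop application systems; own core modules and tackle complex problems; lead development engineers in identifying and resolving technical issues; ensure system performance and stability; handle documentation, code review, and unit testing; ensure project schedule and quality.
3. Keep up with technology trends, apply advanced and mature technologies to products, and use technology to drive business growth.
Updated 2025-10-14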
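Experienced hire | 3+ years | Technology - Development
1. Develop core technologies and algorithms for privacy-preserving machine learning and privacy-preserving computation that connect data silos, addressing privacy protection, high performance, scalability, and related problems.
2. For finance and other specific scenarios, develop privacy-preserving machine learning and statistical analysis solutions; support privacy enhancement for traditional AI models, large models, and data systems; build AI and BI systems that balance model performance, data security, and practicality.
Updated 2025-08-29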