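Ant Financial: Ant Group - Senior Data Development Engineer - Wealth and Insurance
Experienced hire, full-time, 3+ years, Technical - Data. Location: Shanghai | Hangzhou. Status: hiring.
Requirements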
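1. 3+ years of work experience; bachelor's degree or above in computer science or a related field; extensive hands-on experience in data modeling.
2. Expert in business modeling, data warehouse modeling, and ETL design and development; systematic experience in data quality and data governance; in-depth hands-on experience in related areas on large projects; able to independently lead the overall model design for a business domain; cross-domain communication and coordination skills.
3. Expert in the big data stack such as Hadoop/YARN/Hive, with a deep understanding of the underlying implementation and the ability to tune it; experience with ultra-large-scale data projects; experience processing data at million-level TPS is a strong plus.
4. Proficient in the real-time computing stack, including data collection and compute engines such as Storm/Spark/Flink, with a deep understanding of transactions, fault tolerance, and reliability in real-time computing.
5. Strong logical thinking and verbal communication skills, plus strong project communication and coordination skills.
6. Some development ability in Java and Python; machine learning algorithm skills are a strong plus.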
Responsibilities
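1. Build the data systems and solutions for Ant's wealth-management (蚂蚁财富) and insurance business lines, enabling data-driven operations, improving operational efficiency, and ensuring data quality and stability.
2. Plan the core data systems of the business domain; design data solutions that treat data as the core factor of production and resolve pain points that arise as the business runs, including but not limited to user tagging systems, intelligent and automated data systems, and real-time data systems.
3. Build high-quality domain data assets, including but not limited to external data ingestion, data labeling, and feature mining; provide the necessary support for model training, iteration, and deployment in intelligent marketing, large-model, and other intelligent scenarios, so that the business's intelligence-upgrade goals progress and land smoothly.
4. Lead or participate in data governance to deliver high-quality data continuously and at low cost; build a data platform for internal data sharing and integration; ensure compliant data use and prevent data leakage and misuse.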
Includes English-language materials
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
Data Governance
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
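Hadoop provides reliable, scalable application-layer computing and storage support for large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and scales from a single machine to several thousand machines.
[English] Hadoop Tutorial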
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Yarn
[English] Introduction
https://yarnpkg.com/getting-started
Yarn is an established open-source package manager used to manage dependencies in JavaScript projects. (This is the JavaScript tool, not the Hadoop YARN resource manager named in the requirements.)
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Apache Storm
[English] Tutorial
https://storm.apache.org/releases/2.6.0/Tutorial.html
In this tutorial, you'll learn how to create Storm topologies and deploy them to a Storm cluster.
https://www.baeldung.com/apache-storm
This tutorial will be an introduction to Apache Storm, a distributed real-time computation system.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Related Positions
Experienced hire, 3+ years, Technical - Security
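1. Build and operate the security technology system for data security and personal privacy protection in Ant Group's wealth and insurance line.
2. Plan, design, implement, and operate security technical solutions for ecosystem data security risk governance in Ant Group's wealth and insurance line.
3. Develop the data security risk sensing, auditing, and response capabilities of Ant Group's wealth and insurance line, and optimize the related mechanisms and processes through data-driven, productized methods.
Updated 2025-09-16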
Experienced hire, 1-3 years, NetEase Youdao
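1. Participate in the design and development of the data warehouse for the 升学中心 (school admissions center), covering data model design and development, data monitoring, performance optimization, and related technical work.
2. Build metric and tag systems tailored to the center's business characteristics.
3. Help improve and implement the quality assurance system for data warehouse development, building stable and reliable data services and safeguards.
4. Research and track big data technology trends, and explore and implement related data solutions.
5. Write and maintain data warehouse documentation.
Updated 2025-04-03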
Experienced hire, Technical
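1. Develop and maintain the company's video cloud business data, providing fast, accurate, and flexible data warehouse support for the on-demand and live-streaming business and the video cloud R&D team.
2. Build a deep understanding of the business logic; design and optimize data models.
3. Handle the acquisition, cleaning, classification, and integration of massive data sets.
4. Design and implement BI analysis, report presentation, and data product development.
5. Independently troubleshoot and resolve data issues, including data quality and performance problems.
Updated 2025-02-13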