鹰角网络大数据工程师(数仓方向)
校招全职程序技术类地点:上海状态:招聘
任职要求
1.大数据、软件工程、计算机科学、数学等相关专业本科以上学历。 2.熟练掌握SQL,具备编写和优化复杂SQL查询的能力。 3.对数据结构,算法和分布式计算有基础的理解,具备解决复杂数据问题的能力。 4.具备良好的沟通和协作能力,能够与不同背景的团队成员有效合作。 5.对大数据组件如Spark,Hive,Flink,Kafka原理有一定了解的优先。 6.有数仓,数据湖相关实习经验优先。
工作职责
1.负责数据收集、清洗、转换和处理(ETL)流程的设计,开发和实施。 2.参与数据体系的技术调研、验证。 3.监控数据质量,确保数据的准确性和完整性。 4.参与数据仓库的建设、模型的梳理和建立。
包括英文材料
大数据+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
学历+
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
什么是SQL?简单地说,SQL就是访问和处理关系数据库的计算机标准语言。
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
数据结构+
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
算法+
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Spark+
[英文] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
相关职位
社招K9374
1、负责电商、生活服务、直播、抖音等业务的离线与实时数据仓库的构建; 2、负责数据模型的设计,ETL实施,ETL性能优化,ETL数据监控以及相关技术问题的解决; 3、负责指标体系建设与维护; 4、深入业务,理解并合理抽象业务需求,发挥数据价值,与业务团队紧密合作; 5、参与大数据应用规划,为数据产品、挖掘团队提供应用指导; 6、参与数据治理工作,提升数据易用性及数据质量。
更新于 2023-02-06
社招T0951
1、 负责电商、生活服务、直播、抖音等业务的离线与实时数据仓库的构建; 2、负责数据模型的设计,ETL实施,ETL性能优化,ETL数据监控以及相关技术问题的解决; 3、负责指标体系建设与维护; 4、深入业务,理解并合理抽象业务需求,发挥数据价值,与业务团队紧密合作; 5、参与大数据应用规划,为数据产品、挖掘团队提供应用指导; 6、参与数据治理工作,提升数据易用性及数据质量。
更新于 2022-06-23
社招自动驾驶板块
1. 数据指标体系搭建:深挖数据价值,构建和维护车端信号数据仓库体系和数据指标体系,为算法和数据闭环提供PB级共享平台和框架支持;负责核心数据指标体系(包括业务分类、生产状态、功能指标等)的搭建、监控与运营;快速输出并不断沉淀标准化的产品数据体系,让业务的数据化运营更加高效、便捷; 2. 数据治理:梳理上下游的数据资产,制定及推广数据标准(如研发规范、质量规范、保障规范)和治理流程,确保数据准 确性、完整性和一致性。 3. 数据管理:负责元数据管理、数据质量检查、数据分级管理等系统的设计、开发及应用,提升数据易用性、可用性及稳定性; 4. 业务团队数据需求的研发支撑:如日志埋点、车联网数据、内部与外部数据的采集、数据同步、数据清洗与标准化、数据模型设计、离线数据处理、实时数据处理、数据服务化、数据可视化等;
更新于 2025-07-08