XPeng Motors Senior Big Data Development Engineer
Experienced hire · Full-time · 3+ years · Location: Guangzhou · Status: Hiring
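Requirements
1. Bachelor's or master's degree in a computer science related field; 3+ years of experience in data platform development; experience implementing business analytics is preferred.
2. In-depth understanding of the common big data technology stack, such as Hadoop, Hive, HBase, Flink, and Kafka.
3. Proficient in SQL and Python; familiar with the data warehouse ETL development process, UDF development, Flink SQL job development, and so on.
4. Project experience building large-scale data warehouses and business metric systems.
5. Familiarity with Alibaba Cloud big data solutions (DataWorks, MaxCompute, Hologres, etc.) is preferred.
6. Good cross-team collaboration and communication skills, strong learning ability, eagerness to explore new technologies, a strong sense of responsibility, and the ability to work under pressure.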
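Responsibilities
1. Own the architecture design and model building of the company-wide data warehouse and business analytics, develop the operational audit metric system, and write core code supporting real-time and offline analysis.
2. Work with business teams to collect, organize, and analyze business requirements; plan and carry out delivery, ensuring high-quality results on schedule.
3. Work with the company's business analyst team to strengthen business intelligence analysis capabilities and improve the efficiency and accuracy of product planning decisions.
4. Maintain the data warehouse and business analytics reporting system, ensuring the quality and efficiency of data processing.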
Includes English-language materials
Education
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
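Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial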
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows you to store and process big data in a distributed environment across clusters of computers using simple programming models.
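The "simple programming model" mentioned above usually refers to MapReduce. The following is a minimal in-process sketch of the map, shuffle, and reduce steps using word count as the example. It does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the framework would distribute these steps across machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data tools"]))
```

Because each call to `map_phase` only looks at one line and each call to `reduce_phase` only looks at one key, both steps can in principle run in parallel, which is the property a distributed framework relies on.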
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with the HBase shell.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Kafka
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and processing relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest version of Python 3.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
ETL
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
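As a rough illustration of the extract, transform, load flow described above, here is a minimal sketch using only the Python standard library. The sample data, the `orders` table, and the function names are all hypothetical; a production pipeline would read from real source systems and write to an actual data warehouse rather than an in-memory SQLite database.

```python
import csv
import io
import sqlite3

# Hypothetical raw export standing in for a source system.
RAW_CSV = """order_id,city,amount
1,Guangzhou,100.5
2,guangzhou ,200
3,Shenzhen,
4,Shenzhen,50
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount, normalize city names, cast types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        city = row["city"].strip().title()
        cleaned.append((int(row["order_id"]), city, float(amount)))
    return cleaned


def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Run the three steps and return total amount per city."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(RAW_CSV)))
    query = "SELECT city, SUM(amount) FROM orders GROUP BY city ORDER BY city"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(run_etl())
```

Keeping the three steps as separate functions makes each one testable on its own, and the final aggregate query is the kind of metric a reporting layer would read from the loaded table.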
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Related Positions
Experienced hire · Technical
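Responsibilities: Plan and build Hadoop and its surrounding ecosystem for 58 EB-scale massive data storage and task scheduling at the hundred-million scale; lead the cross-data-center architecture and rollout for a cluster of nearly ten thousand machines; drive innovative application of technical directions such as online/offline co-location, storage-compute separation, and cloud migration; and build a stable, efficient next-generation big data platform.
• Own architecture design, technology selection, and resolution of hard technical problems for cross-data-center deployment, online/offline co-location, storage-compute separation, and cloud migration
• Lead the team in custom development of Hadoop and its surrounding ecosystem
• Own deep performance optimization of large-scale Hadoop clusters
• Engage with the open-source community, actively bring in major community features and improvements, and contribute back to the community to increase influence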
Updated 2022-03-09
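Experienced hire · 3+ years · TEG Public Technology
1. Design and implement the big data development platform, providing efficient and stable data development platform capabilities to the company's business units.
2. Take part in the full development process, including requirements review, technical design, implementation, code review, and functional testing.
3. Work closely with product managers, QA, operations, and other related teams to deliver platform capabilities quickly and efficiently.
4. Continuously optimize the system architecture, build up platform-level shared service components, improve the platform's development iteration efficiency, and further improve system performance and stability.
Updated 2025-08-28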