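XPeng Motors: Senior Big Data Development Engineer / Expert
Experienced hire · Full-time · 5+ years of experience · Location: Guangzhou · Status: Hiring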
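Requirements
1. Degree in computer science or a related field, at bachelor's or master's level or above, with 5+ years of big data experience;
2. Solid foundation in data structures and algorithms; proficient in at least one programming language such as Java, Scala, or Python; skilled in applying common algorithms and data structures;
3. Familiar with common distributed architectures; in-depth understanding of one or more big data frameworks and components such as Flink, Spark, Kafka, Iceberg, Paimon, StarRocks, Elasticsearch, and ClickHouse, including their internals and tuning;
4. Hands-on experience with production-grade data components such as data lakes, data quality, and metadata management; experience with large-scale near-real-time data warehouses is preferred;
5. Familiar with the design of distributed systems and microservice applications; understanding of large-scale server-side development; development experience with Docker and Kubernetes;
6. Strong learning and problem-solving abilities; rigorous logic and clear thinking; responsible and innovative;
7. Good communication and teamwork skills; able to collaborate with others to achieve shared goals;
8. Experience in autonomous driving or at a large internet company is preferred.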
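Responsibilities
1. Mine the value of data in depth; build and maintain the data warehouse system and data metrics system for vehicle-side signal data, providing framework support for algorithms and the data closed loop;
2. Help build a unified batch-and-stream data analytics platform that supports fast lookup and analysis of autonomous driving perception and full-stack data at the ten-billion scale;
3. Take part in platform architecture planning; track and research cutting-edge technologies; select and test toolchains; solve the data platform's core technical challenges;
4. Establish monitoring and feedback metrics; continuously optimize and improve the product's architecture and performance; ensure data quality and platform stability for a PB-scale data warehouse.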
Includes English-language materials
Education+
Big data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Data structures+
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial java
Algorithms+
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala+
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese; free; starts from zero; complete examples; based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Iceberg+
https://iceberg.apache.org/spark-quickstart/
This guide will get you up and running with Apache Iceberg™ using Apache Spark™, including sample code to highlight some powerful features.
https://www.baeldung.com/apache-iceberg-intro
This tutorial will discuss Apache Iceberg, a popular open table format in today’s big data landscape.
https://www.youtube.com/watch?v=TsmhRZElPvM
You’ve probably heard about Apache Iceberg™—after all, it’s been getting a lot of buzz.
StarRocks+
https://docs.starrocks.io/docs/quick_start/
These Quick Start guides will help you get going with a small StarRocks environment.
https://itnext.io/introduction-to-starrocks-a-new-modern-analytical-database-1db2177d26e1
Recently, I had the opportunity to explore StarRocks which is the new kid in the block when talking about massive scale databases which are able to handle petabytes of data.
Elasticsearch+
https://www.youtube.com/watch?v=a4HBKEda_F8
Learn about Elasticsearch with this comprehensive course designed for beginners, featuring both theoretical concepts and hands-on applications using Python (though applicable to any programming language). The course is structured in two parts: first covering essential Elasticsearch fundamentals including index management, document storage, text analysis, pipeline creation, search functionality, and advanced features like semantic search and embeddings; followed by a practical section where you'll build a real-world website using Elasticsearch as a search engine, working with the Astronomy Picture of the Day (APOD) dataset to implement features such as data cleaning pipelines, tokenization, pagination, and aggregations.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Distributed systems+
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
Microservices+
https://learn.microsoft.com/en-us/training/modules/dotnet-microservices/
Microservice applications are composed of small, independently versioned, and scalable customer-focused services that communicate with each other by using standard protocols and well-defined interfaces.
https://microservices.io/
Microservices - also known as the microservice architecture - is an architectural style that structures an application as a collection of two or more services.
https://spring.io/microservices
Building small, self-contained, ready to run applications can bring great flexibility and added resilience to your code.
https://www.ibm.com/think/topics/microservices
Microservices, or microservices architecture, is a cloud-native architectural approach in which a single application is composed of many loosely coupled and independently deployable smaller components or services.
https://www.youtube.com/watch?v=CqCDOosvZIk
https://www.youtube.com/watch?v=hmkF77F9TLw
Learn about software system design and microservices.
Docker+
https://www.youtube.com/watch?v=GFgJkfScVNU
Master Docker in one course; learn about images and containers on Docker Hub, running multiple containers with Docker Compose, automating workflows with Docker Compose Watch, and much more. 🐳
https://www.youtube.com/watch?v=kTp5xUtcalw
Learn how to use Docker and Kubernetes in this complete hands-on course for beginners.
Kubernetes+
https://kubernetes.io/docs/tutorials/kubernetes-basics/
This tutorial provides a walkthrough of the basics of the Kubernetes cluster orchestration system.
https://kubernetes.io/zh-cn/docs/tutorials/kubernetes-basics/
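Chinese-language version: This tutorial covers the basics of the Kubernetes cluster orchestration system. Each module contains some background information on major Kubernetes features and concepts, and includes an online tutorial for you to work through.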
https://www.youtube.com/watch?v=s_o8dwzRlu4
Hands-On Kubernetes Tutorial | Learn Kubernetes in 1 Hour - Kubernetes Course for Beginners
https://www.youtube.com/watch?v=X48VuDVv0do
Full Kubernetes Tutorial | Kubernetes Course | Hands-on course with a lot of demos
Autonomous driving+
https://www.youtube.com/watch?v=_q4WUxgwDeg&list=PL05umP7R6ij321zzKXK6XCQXAaaYjQbzr
Lecture: Self-Driving Cars (Prof. Andreas Geiger, University of Tübingen)
https://www.youtube.com/watch?v=NkI9ia2cLhc&list=PLB0Tybl0UNfYoJE7ZwsBQoDIG4YN9ptyY
You will learn to make a self-driving car simulation by implementing every component one by one. I will teach you how to implement the car driving mechanics, how to define the environment, how to simulate some sensors, how to detect collisions and how to make the car control itself using a neural network.
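Related positions
Experienced hire · 3+ years of experience
1. Design and develop algorithm solutions, based on AI technologies such as large models and reinforcement learning, for new energy vehicle functions including whole-vehicle energy consumption optimization, active safety control, and ride comfort;
2. Participate closely in AI model training, fine-tuning, and deployment; drawing on actual vehicle operating conditions and business needs, drive innovative applications of AI on the vehicle side and bring them to mass production;
3. Work with the powertrain, electronic control, chassis, and other related departments on multi-dimensional innovation topics based on in-vehicle data, such as energy consumption prediction, intelligent driving strategies, anomaly detection, and comfort optimization;
4. Follow trends in cutting-edge AI and continue to drive the practical adoption of new algorithms, including multimodal models, large models, and reinforcement learning, in the new energy vehicle industry;
5. Participate in related testing and validation; drive the porting, integration, and optimization of algorithms onto actual vehicles; resolve technical problems that arise during engineering;
6. Support technical documentation for patents and papers, and internal technical sharing.
Updated 2025-06-05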
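Experienced hire · 5+ years of experience · Technical team · AI &
1. Design, develop, and optimize models for each layer (e.g., ODS, DWD, DWS, ADS) of the offline and real-time data warehouses, supporting data analysis and business applications;
2. Develop the data warehouse and reports for the group's financial data analysis system;
3. Independently handle ETL development for complex business logic, task scheduling, and operations monitoring, ensuring the accuracy and stability of data processing pipelines;
4. Establish and monitor data quality rules; proactively detect, track, and resolve data quality issues to ensure data reliability and trustworthiness.
Updated 2025-09-08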
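Experienced hire · 3+ years of experience
Responsibilities:
1. Plan how big data can empower business R&D, and help specialist domains with data and tool support;
2. Develop the toolchain for the powertrain business to improve efficiency;
3. Empower the powertrain business with AI algorithms to drive efficient growth;
4. Collect, clean, and preprocess multimodal data (text, image, audio);
Updated 2024-08-28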