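NIO Data Development Engineer
Experienced hire · Full-time · 5-7 years · Digital Technology · Location: Shanghai · Status: Hiring
Requirements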
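- Proficient in dimensional modeling design methods, with experience in large-scale data processing (ETL) and strong SQL performance tuning skills
- Familiar with data warehouse domain knowledge and skills, including but not limited to data mart design, metadata management, and data quality
- Expert in the Spark and Flink data development frameworks, with extensive hands-on experience in real-time data development and performance tuning
- Familiar with Hadoop, HBase, Kafka, Elasticsearch, …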
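Responsibilities
- Plan and build the company's core data assets, and support the design, development, and delivery of core business scenarios
- Work deeply across business domains, understand business needs, lead and build the data warehouse team, and realize the value of data for the business
- Own the foundational architecture and implementation of the offline and real-time data warehouses, drive the adoption of unified batch and stream processing, and take responsibility for data delivery across the full lifecycle
- Contribute to data development for data products and applications, uncover the commercial value of data, and work with product and engineering teams to build data products with an excellent user experience
- Provide efficient solutions for large-scale data processing and analysis, deliver on real-time and offline requirements, and provide reliable technical support to front-line developers
- Dig deep into data needs in line with business direction, produce technical solutions and standards, and explore cutting-edge industry technology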
Includes English-language materials
ETL+
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Performance tuning+
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Development frameworks+
[English] Understanding Modern Development Frameworks: A Guide for Developers and Technical Decision-makers
https://www.freecodecamp.org/news/understanding-modern-development-frameworks-guide-for-devs/
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
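Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using a simple programming model, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial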
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
HBase+
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model that is similar to Google's big table designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with HBase shell.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
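Related positions
Experienced hire · Data development
1. Build and refine the risk-control data marts required by the business, and take part in model structure design, model mapping development, and feature development; 2. Manage and process in-house and third-party data in layers, and build reusable data assets through sound data abstraction and modeling; 3. Take part in building foundational data platforms and infrastructure, including data governance, data quality, data services, and data products.
Updated 2025-06-16 · Beijing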
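Experienced hire · 3+ years · Data development
1. Take part in building the PB-scale data warehouse for JD Food Delivery (京东外卖) & Miaosong (秒送), providing complete and efficient data support to business teams; 2. Build the offline data warehouse on the principles of simplicity, ease of use, efficiency, and reliability, supporting upstream data products and analysts; 3. Build a real-time data warehouse to serve real-time business scenarios; 4. Take a deep part in building data products, providing complete data solutions inside and outside the company; 5. Meet the day-to-day data needs of the company's departments.
Updated 2025-06-15 · Shanghai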
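Experienced hire · Data development
1. Develop a deep understanding of the e-commerce platform business, build analytical models around business scenarios, and identify latent problems and growth opportunities to support business growth; 2. Design the data architecture for the platform business and carry out real-time and offline data development; 3. Design and implement the future data flow architecture and development process, continuously improving stability and engineering efficiency.
Updated 2025-06-15 · Beijing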