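JD.com (京东) Data Development Engineer
Experienced hire | Full-time | 3+ years of experience | Data development | Location: Shanghai | Status: Hiring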
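Requirements
1. Degree in computer science or a related field, with at least 3 years of experience in data warehouse development and implementation; able to design data warehouse models based on business needs and manage those models to ensure data delivery and quality.
2. Proficient in SQL development, with substantial experience in Hive SQL performance tuning; ability to develop UDFs is a plus.
3. Experience with real-time data development using Spark Streaming, Flink, Storm, or similar is a plus.
4. Familiar with the architecture of storage engines such as MySQL, HBase, Redis, Doris, and ClickHouse; able to select technologies according to the application scenario.
5. Strong learning ability, excellent logical thinking, good comprehension and communication skills, and a conscientious, responsible approach to work.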
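Responsibilities
1. Participate in building the PB-scale data warehouse for JD food delivery (京东外卖) and instant delivery (秒送), providing complete and efficient data support to all business teams.
2. Build the offline data warehouse on the principles of simplicity, ease of use, efficiency, and reliability, supporting upstream data products and analysts.
3. Build the real-time data warehouse to serve real-time business scenarios.
4. Take a deep part in building data products, providing comprehensive data solutions both inside and outside the company.
5. Fulfill the day-to-day data needs of the company's departments.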
Including English-language materials
Data Warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Performance Tuning
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
MySQL
https://juejin.cn/post/7190306988939542585
A hands-on, in-depth MySQL learning roadmap covering database fundamentals, the use of SQL statements, database constraints, design, and more.
[English] MySQL Tutorial
https://www.mysqltutorial.org/
Your go-to resource for mastering MySQL in a fast, easy, and enjoyable way.
https://www.youtube.com/watch?v=5OdVJbNCSso
MySQL SQL tutorial for beginners
https://www.youtube.com/watch?v=7S_tz1z_5bA
This beginner-friendly course teaches you SQL from scratch.
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with the HBase shell.
Redis
[English] Developer Hub
https://redis.io/dev/
Get all the tutorials, learning paths, and more you need to start building—fast.
https://www.runoob.com/redis/redis-tutorial.html
REmote DIctionary Server (Redis) is a key-value storage system written by Salvatore Sanfilippo; it is a cross-platform, non-relational database.
https://www.youtube.com/watch?v=jgpVdJB2sKQ
In this video I will be covering Redis in depth from how to install it, what commands you can use, all the way to how to use it in a real world project.
Doris
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
ClickHouse
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
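Related Positions
Experienced hire | Data development
1. Build and refine the risk-control data mart required for risk management according to business needs; participate in model structure design, model mapping development, feature development, and related work.
2. Manage and process proprietary and third-party data in layers, building up reusable data assets through sound data abstraction and modeling.
3. Participate in building foundational data platforms and infrastructure, including data governance, data quality, data services, and data products.
Updated 2025-06-16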
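Experienced hire | Data development
1. Develop a deep understanding of the e-commerce platform business, build analytical models around business scenarios, and uncover potential problems and growth opportunities to support business development.
2. Carry out data architecture design and both real-time and offline data development for the platform business.
3. Design and implement the future data-flow architecture and development processes, continuously improving stability and engineering efficiency.
Updated 2025-06-15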