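NetEase Senior Data Development Engineer (Hangzhou)
Experienced hire · Full-time · 5-10 years · NetEase corporate functions · Location: Hangzhou · Status: Hiring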
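Requirements
1. Bachelor's degree or above in computer science, software engineering, or a related field; 8+ years of data warehouse development experience; expert in mapping out business data, data warehouse model design, and ETL development; strong development and performance-tuning skills.
2. Expert in data warehouse theory, with deep understanding of and hands-on experience in data processing, dimensional modeling, and data governance. Able to quickly understand business models and data models.
3. Expert in SQL; familiar with one or more mainstream databases such as Oracle, MySQL, MongoDB, and Doris; familiar with database internals; proficient in database development work, with attention to SQL performance.
4. Familiar with the Linux development environment and SQL development; proficient in scripting languages such as Python and shell.
5. Familiar with the working principles, development, and configuration tuning of Hadoop, Hive, HBase, Spark, and Flink.
6. Experience in data mining and analysis is a plus.
7. Strong logical thinking, a sense of responsibility, enthusiasm for new technologies, and the ability to work under pressure.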
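Responsibilities
1. Own data warehouse planning and design for the NetEase Group finance data middle platform.
2. Carry out collection, cleansing, organization, deduplication, and governance of the relevant raw data, ensuring its timeliness, completeness, consistency, and accuracy.
3. Take part in business requirements research, design dimensional data warehouse models based on business requirements, develop the data models, and build up a set of data metrics.
4. Continuously improve and optimize ETL, analytical processing, and related issues; perform data analysis on structured data.
5. Manage project development progress and code quality, and build up the technical documentation.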
Includes English-language materials
Education
Data warehouse
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
ETL
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
Data governance
https://www.ibm.com/think/topics/data-governance
Data governance is the data management discipline that focuses on the quality, security and availability of an organization’s data.
https://www.youtube.com/watch?v=uPsUjKLHLAg
Building data fabric eliminates the technological complexities of data governance so users can connect to the right data at the right time, regardless of where it resides.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Put simply, SQL is the standard computer language for accessing and processing relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Oracle
[English] Oracle Tutorial
https://www.oracletutorial.com/
On this website, you can learn Oracle Database fast and easily.
https://www.youtube.com/watch?v=QHYuuXPdQNM&list=PL_c9BZzLwBRJ8f9-pSPbxSSG6lNgxQ4m9
MySQL
https://juejin.cn/post/7190306988939542585
A hands-on, all-in-one MySQL learning roadmap covering database fundamentals, SQL statement usage, database constraints, design, and more.
[English] MySQL Tutorial
https://www.mysqltutorial.org/
Your go-to resource for mastering MySQL in a fast, easy, and enjoyable way.
https://www.youtube.com/watch?v=5OdVJbNCSso
MySQL SQL tutorial for beginners
https://www.youtube.com/watch?v=7S_tz1z_5bA
This beginner-friendly course teaches you SQL from scratch.
MongoDB
https://learnxinyminutes.com/mongodb/
MongoDB is a NoSQL document database for high volume data storage.
https://studio3t.com/academy/#courses
The fastest way to learn MongoDB
https://www.youtube.com/watch?v=c2M-rlkkT5o
This video will give you an introduction to MongoDB in 1 hour. Afterwards I recommend exploring aggregation, replication, and sharding.
https://www.youtube.com/watch?v=ExcRbA7fy_A&list=PL4cUxeGkcC9h77dJ-QJlwGlZlTd4ecZOA
You'll learn how to use MongoDB (a NoSQL database) from scratch. You'll also learn how to integrate it into a simple Node.js API.
Doris
https://doris.apache.org/docs/gettingStarted/what-is-apache-doris
Linux
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with full examples, based on the latest version of Python 3.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Bash
[English] The Bash Guide
https://guide.bash.academy/
A quality-driven guide through the shell's many features.
https://www.youtube.com/watch?v=tK9Oc6AEnR4
Understanding how to use bash scripting will enhance your productivity by automating tasks, streamlining processes, and making your workflow more efficient.
Scripting
[English] Scripting language
https://en.wikipedia.org/wiki/Scripting_language
https://zhuanlan.zhihu.com/p/571097954
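A script is usually interpreted rather than compiled. Scripting languages are generally simple, easy to learn, and easy to use, with the goal of letting programmers finish writing programs quickly.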
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
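Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial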
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model that is similar to Google's big table designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with HBase shell.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Data mining
https://www.youtube.com/watch?v=-bSkREem8dM
Database vs Data Warehouse vs Data Lake
https://www.youtube.com/watch?v=7rs0i-9nOjo
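Related positions
Experienced hire · 4+ years · A185218
1. Take part in building and operating offline and real-time data warehouses for the games business.
2. Take part in optimizing data ETL workflows and resolve ETL-related technical problems.
3. Take part in data governance across complex data-pipeline dependencies and a diverse data content ecosystem.
4. Drawing on the company's mature big data solutions, quickly deliver data solutions for the business.
Updated 2025-06-18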
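Experienced hire · A116960
1. Abstract and design sound solutions to business problems; design and develop a high-quality underlying data system that drives fast, healthy business growth.
2. Carry out data collection, cleansing, and reduction within the data warehouse.
3. Provide business-facing data services; deliver metric statistics, multidimensional analysis, and presentation.
4. Based on the business and product situation, abstract the business logic and build and develop the big data platform.
5. Take part in data platform architecture design and core development tasks.
Updated 2023-06-28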
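Experienced hire · 3+ years · 诚云科技
1. Enterprise data system development: Own enterprise-level data architecture design and define data standards and specifications to support business decision-making and intelligent applications. Build data collection and processing pipelines and integrate multi-source data (internal systems, external APIs, public databases, etc.), ensuring data quality and consistency.
2. External data acquisition and processing: Obtain high-quality external data (such as industry trends, market dynamics, and competitor information) through API interfaces or third-party data services. Design data-cleansing rules and automation scripts to handle missing values, outliers, and format standardization, producing structured data assets.
3. Data modeling and analysis: Build statistical or machine learning models (such as classification, regression, and clustering) for business scenarios (such as user profiling, risk prediction, and supply chain optimization). Develop a reusable data analysis toolchain that supports real-time and offline analysis and outputs visual reports or API endpoints for the business to call.
Updated 2025-05-28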