SF Express (顺丰) Big Data Development Engineer
Experienced hire · Full-time · 3-5 years · Location: Wuhan · Status: Hiring
Qualifications
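1. Technical background and experience:
• Bachelor's degree or above in computer science, software engineering, GIS, or a related field.
• 3-5 or more years of hands-on big data development experience, with the ability to independently design and deliver the architecture of medium-to-large big data systems.
2. Core technology stack (hard requirements):
• Expert knowledge of Spark architecture and internals; proficient in developing complex business logic with PySpark or Scala; extensive experience tuning Spark Streaming / Structured Streaming.
• Proficient with big data ecosystem components such as Hadoop, Hive, HBase, and Flink; solid SQL skills and…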
Responsibilities
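1. Big data architecture and data warehouse design: Own the architecture design and core module development of the spatiotemporal big data platform. Lead model design, ETL development, and job tuning for a high-performance distributed data warehouse.
2. Data pipeline development: Use Spark / PySpark (or Scala) to clean, aggregate, and run spatial computations on massive location data, trajectory data, and complex address data, keeping the pipelines highly available and performant.
3. Technical support and operations: Take a hands-on role in day-to-day development and debugging of the compute and storage clusters built on Hadoop, Hive, and HBase; analyze performance bottlenecks and troubleshoot hard problems in production under high-concurrency and massive-data workloads.
4. Data governance and standards adoption: Expert in data governance techniques for traditional and non-big-data environments. For multi-source, heterogeneous address and spatial data, carry out quality monitoring, data standardization, and data lineage graph construction.
5. Cross-team collaboration: Work closely with map product managers and spatial algorithm scientists to turn cutting-edge AI/spatiotemporal algorithm models into production-grade big data engineering, and jointly tackle hard business problems.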
Related learning resources (including English-language materials)
Education
GIS
https://www.osgeo.org/resources/learn-gis-free-complete-course/
Learning GIS, especially a modern GIS approach, can seem overwhelming, but this video explains how to take a four-step process to learn modern GIS and some tools to help you get started!
https://www.youtube.com/watch?v=n9dDsYLIx1c
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
System Design
https://roadmap.sh/system-design
Everything you need to know about designing large scale systems.
https://www.youtube.com/watch?v=F2FmTdLtb_4
This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture with clear explanations, real-world examples, and practical strategies.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Scala
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
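Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large data sets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to thousands of machines.
[English] Hadoop Tutorial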
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.