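希音 (SHEIN) Senior / Staff Data Development Engineer - Big Data Applications
Experienced hire | Full-time | 2+ years of experience | Information Technology | Locations: Nanjing | Shanghai | Shenzhen | Beijing | Status: Hiring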
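Requirements
1. Bachelor's degree or above in computer science or a related information technology field;
2. At least 2 years of experience in big data warehouse development (level S6 requires at least 4 years);
3. Proficiency in one of Python, Java, or Shell;
4. Familiarity with big data system components (such as Hive, MapReduce, Spark, and HBase), and the ability to write and optimize complex SQL;
5. Working knowledge of real-time processing components (such as Kafka and Flink);
6. Candidates who have led the design and implementation of a PB-scale data warehouse are preferred;
7. Experience with traffic distribution or event tracking is a plus.

Big Data (User): based in Nanjing, Shanghai, or Shenzhen
Responsibilities:
1. Build and design data warehouse models for ad placement and the downstream on-site business, and keep optimizing the models as requirements change and the business grows;
2. Using the company's supporting data products, carry out the actual code development and deployment, and manage and improve data quality;
3. Build data assets for self-service analytics, combining data with products to provide low-cost data productization capabilities.
Requirements:
1. Bachelor's degree or above in computer science or a related information technology field;
2. At least 2 years of experience in big data warehouse development;
3. Proficiency in one of Python, Java, or Shell;
4. Familiarity with big data system components (such as Hive, MapReduce, Spark, and HBase), and the ability to write and optimize complex SQL;
5. Working knowledge of real-time processing components (such as Kafka and Flink);
6. Candidates who have led the design and implementation of a PB-scale data warehouse are preferred;
7. Experience with advertising systems is a plus.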
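Responsibilities
Big Data (Scenario): based in Shenzhen or Beijing
1. Build and design data warehouse models for algorithmic traffic allocation and front-end page iteration, and keep optimizing the models as requirements change and the business grows;
2. Using the company's supporting data products, carry out the actual code development and deployment, and manage and improve data quality;
3. Build data assets for self-service analytics, combining data with products to provide low-cost data productization capabilities.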
Includes English-language materials
Education+
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, beginner-friendly, with complete examples, based on the latest Python 3 release.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Bash+
[English] The Bash Guide
https://guide.bash.academy/
A quality-driven guide through the shell's many features.
https://www.youtube.com/watch?v=tK9Oc6AEnR4
Understanding how to use bash scripting will enhance your productivity by automating tasks, streamlining processes, and making your workflow more efficient.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
MapReduce+
https://www.youtube.com/watch?v=bcjSe0xCHbE
https://www.youtube.com/watch?v=cHGaQz0E7AU
In this video I explain the basics of Map Reduce model, an important concept for any software engineer to be aware of.
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
HBase+
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with HBase shell.
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and manipulating relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Data Warehouse+
https://www.youtube.com/watch?v=9GVqKuTVANE
From Zero to Data Warehouse Hero: A Full SQL Project Walkthrough and Real Industry Experience!
https://www.youtube.com/watch?v=k4tK2ttdSDg
Advertising Systems+
https://github.com/InteractiveAdvertisingBureau/openrtb2.x
Real-time Bidding (RTB) is a way of transacting media that allows an individual ad impression to be put up for bid in real-time.
https://people.eecs.berkeley.edu/~jfc/DataMining/SP12/lecs/lec12.pdf
https://wnzhang.net/teaching/ee448/slides/11-computational-ads.pdf
If a bidder bids higher than his true value, then...
Related positions
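Experienced hire | 3+ years of experience | Information Technology
1. Analyze business requirements, build the data warehouse, and provide data support to business departments;
2. Take part in data source analysis and integrate the big data platform with each business system;
3. Carry out data warehouse design and ETL development on top of the big data technology platform;
4. Research relevant technologies, streamline the big data development workflow, and plan applications of the big data platform.
Updated on 2025-04-16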
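Experienced hire | 5+ years of experience | Information Technology
1. Take primary responsibility for developing the core features of the big data visualization platform;
2. Own product performance optimization to give users a better experience; familiarity with how browsers work and with Node.js is expected;
3. Resolve problems that team members run into and handle the team's day-to-day code review.
Updated on 2025-10-11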
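Experienced hire | Technology team | Development
1. Big data visualization configuration platform development: Design and develop the big data visualization configuration platform, and independently implement its core functional modules. Optimize SQL queries for big data scenarios to improve data query and processing performance. Design and implement high-performance, highly available Java back-end services that keep the visualization platform running reliably.
2. Technical research and innovation: Track the latest trends in big data and visualization technology and keep improving the platform's features. Research and introduce new technologies to improve the system's performance, scalability, and user experience. Solve hard technical problems and bring technical innovation into real projects.
3. Team collaboration and mentoring: Work closely with product managers, front-end developers, data analysts, and other team members to keep projects moving efficiently. Take part in discussing and designing technical solutions, propose feasible recommendations, and drive them to delivery. Share technical experience and help team members grow together.
Updated on 2025-03-11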