Baidu Big Data Development Intern (J72036)
Internship | Part-time | Xiaodu Technology | Location: Beijing | Status: Hiring
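Requirements
-Bachelor's degree or above in computer science or a related field
-In-depth understanding of Spark and Hadoop
-Familiar with programming languages such as Python/Java/Scala/PHP, proficient in SQL, with good coding habits and a solid understanding of distributed systems
-Understanding of the principles of mainstream operating systems such as Windows, Unix, and Linux; able to make proficient use of system-level support for application development
-Broad technical vision, strong drive and curiosity, good at learning and applying new knowledge, and willing to tackle hard problems
-Good communication and logical expression, excellent analytical and problem-solving skills, strong teamwork, and a proactive approach to communication
-Passionate, self-driven, and committed to excellence
-Available for an internship of at least 3 months
Candidates with the following are preferred:
-Awards in computer-related programming competitions, articles published in professional journals, or invention patents
-Experience with big data cloud platforms, compute and storage platforms, or visual development platforms; familiar with software engineering development processes
-Specialized computer knowledge and skills: Storm/Hive/HBase/Kafka, etc.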
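Responsibilities
-Build offline and real-time streaming analytics platforms and tools for big data
-Take part in the storage and querying of massive datasets
-Take part in building the data models that support the business and in computing data metrics
-Work with distributed computing and storage platforms such as Hadoop, Spark, and ES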
Includes English-language materials
Education+
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
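Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large datasets to be processed in a distributed fashion across clusters of computers using simple programming models, and it scales from a single machine to several thousand.
[English] Hadoop Tutorial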
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Python+
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala+
SQL+
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and working with relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Coding Standards+
[English] Google Style Guides
https://google.github.io/styleguide/
Every major open-source project has its own style guide: a set of conventions (sometimes arbitrary) about how to write code for that project. It is much easier to understand a large codebase when all the code in it is in a consistent style.
Windows+
[English] Windows 10 Tutorial
https://www.tutorialspoint.com/windows10/index.htm
This tutorial gives you in-depth information on this new operating system and its procedures.
Unix+
[English] The UNIX® Standard
https://www.opengroup.org/membership/forums/platform/unix
https://www.youtube.com/watch?v=IrDUcdpPmdI
UNIX is an operating system which was first developed in the 1970s, and has been under constant development ever since.
Linux+
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
HBase+
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model that is similar to Google's big table designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with HBase shell.
Kafka+
https://developer.confluent.io/what-is-apache-kafka/
https://www.youtube.com/watch?v=CU44hKLMg7k
https://www.youtube.com/watch?v=j4bqyAMMb7o&list=PLa7VYi0yPIH0KbnJQcMv5N9iW8HkZHztH
In this Apache Kafka fundamentals course, we introduce you to the basic Apache Kafka elements and APIs, as well as the broader Kafka ecosystem.
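Related Positions
Internship | Core Local Commerce-基
1. Based on fulfillment business scenarios, work closely with business teams to build an operational data system that serves the daily operations and analysis needs of frontline delivery management, business analysis, operations, strategy, and other teams.
2. Plan and design operational data tools and application systems or products; coordinate business stakeholders, PM, RD, QA, and other resources to deliver fulfillment data products and iterate on them continuously.
3. Abstract and build data analysis methods and approaches covering themes such as business performance, cost, delivery capacity, and experience, to support business growth.
4. Build the fulfillment data warehouse and work with back-office business teams and operations to improve operating efficiency.
5. Build the capabilities of the data alerting platform and maintain system stability.
Updated 2025-08-24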
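Internship | Core Local Commerce-基
1. Based on food delivery business scenarios, work closely with business teams to build an operational data system that serves the daily operations and analysis needs of frontline food delivery management, business analysis, operations, strategy, and other teams.
2. Plan and design data tools and application systems or products; coordinate business stakeholders, PM, RD, QA, and other resources to deliver food delivery data products and iterate on them continuously.
3. Abstract and build data analysis methods and approaches covering themes such as business performance, transactions, subsidies, marketing, and traffic, to support business growth.
4. Build the food delivery data warehouse and work with back-office business teams and operations to improve business and operating efficiency.
5. Build the capabilities of the data platform and data operations systems, and maintain system stability.
6. Build the capabilities of the user profiling platform and the AI agile development platform.
Updated 2025-10-12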