SF Express (顺丰) Big Data Application R&D Engineer
Experienced hire · Full-time · 5-10 years · Location: Shenzhen · Status: Hiring
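Requirements
1. Bachelor's degree or above from a full-time, nationally enrolled (统招) program, majoring in computer science, mathematics, statistics, big data, artificial intelligence, or a related field;
2. At least 3 years of work experience in big data, machine learning, data mining and analysis, or a related field; hands-on project experience preferred;
3. Command of the big data technology stack; familiar with at least one of Java/Scala/Python; strong Python/SQL/ETL development and tuning skills; proficient with big data components such as Hadoop/Hive/Spark/Flink;
4. Clear logical thinking, proactive, with a sense of ownership over assigned work and self-driven growth;
5. Curious about and alert to new technologies and methods; able to quickly learn, master, and apply them.
Bonus points:
● Familiar with the large-model technology stack (e.g. LangChain/LlamaIndex), with large-model fine-tuning experience (parameter-efficient fine-tuning such as LoRA/P-Tuning)
● Experience tuning model performance in massive-data scenarios
● Familiar with graph databases and vector databases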
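Responsibilities
1. For logistics business scenarios, apply big data analysis, data mining, and machine learning algorithms to meet business needs, ensuring the efficiency and accuracy of the algorithms;
2. Use AI, big data analysis, large models, and related capabilities to address inefficient, labor-intensive scenarios inside the enterprise;
3. Participate in end-to-end data development, including data collection, log parsing, data synchronization, data cleansing, data model design, offline/real-time development, data services, visualization, and data governance;
4. Explore integrating large-model technology with the existing data architecture to deliver innovative applications such as LLM-based intelligent Q&A and knowledge graph construction;
5. Work with business departments to collect and analyze requirements, and follow up on solution implementation and rollout, ensuring that data solutions and their results meet business needs;
6. Research and analyze the application value of data, provide data insights and decision support, and deliver data-driven solutions to business departments.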
Includes English-language materials
Education
Big Data
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Data Mining
https://www.youtube.com/watch?v=-bSkREem8dM
Database vs Data Warehouse vs Data Lake
https://www.youtube.com/watch?v=7rs0i-9nOjo
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Scala
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
a free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction into all of the core concepts in python.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Simply put, SQL is the standard computer language for accessing and processing relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
ETL
https://www.ibm.com/think/topics/etl
ETL—meaning extract, transform, load—is a data integration process that combines, cleans and organizes data from multiple sources into a single, consistent data set for storage in a data warehouse, data lake or other target system.
https://www.youtube.com/watch?v=OW5OgsLpDCQ
It explains what ETL is and what it can do for you to improve your data analysis and productivity.
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
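Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows distributed processing of large data sets across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial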
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Hive
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Large Models (LLMs)
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
LangChain
https://python.langchain.com/docs/tutorials/
New to LangChain or LLM app development in general? Read this material to quickly get up and running building your first applications.
https://www.freecodecamp.org/news/beginners-guide-to-langchain/
LangChain is a popular framework for creating LLM-powered apps.
LlamaIndex
https://developers.llamaindex.ai/python/framework/getting_started/starter_example/
This tutorial will show you how to get started building agents with LlamaIndex.
https://www.ibm.com/think/tutorials/llamaindex-rag
LlamaIndex is a powerful open source framework that simplifies the process of building RAG pipelines.
Performance Tuning
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.