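Kuaishou Senior Java Development Engineer (Production Platform) - [Data Platform]
Experienced hire | Full-time | 5-10 years | D11431 | Location: Beijing | Status: Hiring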
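Requirements
1. Bachelor's degree or above in computer science or a related field, with at least three years of work experience;
2. Understanding of LLM-related concepts, frameworks (such as LangChain), and techniques (such as prompt engineering and RAG), as well as mainstream code-generation tools (such as Cursor);
3. Familiarity with open-source big data compute and analytics engines such as Hive, Spark, Flink, ClickHouse, and Hudi;
4. Familiarity with mainstream open-source Java frameworks (such as Netty and Spring), a passion for technology, and near-exacting standards for code quality and development conventions;
5. Quick thinking and strong communication and teamwork skills; able to communicate well with product teams and users and to proactively drive projects forward;
6. Preferred: experience building LLM applications (such as a data-discovery Agent or a Text2SQL Agent), or experience using and fine-tuning mainstream LLMs (Qwen, GPT, DeepSeek).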
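Responsibilities
1. Lead (or participate in) planning and designing the backend technology stack and software architecture of Kuaishou's next-generation Data + AI production and governance platform, including the offline/real-time development platform, data security, data map, and LLM data synchronization/task scheduling systems;
2. Make full use of LLM techniques such as model fine-tuning, prompt engineering, and RAG to build intelligent production capabilities for development, operations, and governance;
3. Make full use of technologies such as microservices and containerization to build data middle-platform services that are highly available, highly scalable, loosely coupled, and highly cohesive;
4. Keep up with relevant industry technology stacks, bring creative technical solutions to Kuaishou's data product development, and solve the complex problems and challenges it faces.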
Includes English-language materials
Education+
LLMs+
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
LangChain+
https://python.langchain.com/docs/tutorials/
New to LangChain or LLM app development in general? Read this material to quickly get up and running building your first applications.
https://www.freecodecamp.org/news/beginners-guide-to-langchain/
LangChain is a popular framework for creating LLM-powered apps.
RAG+
https://www.youtube.com/watch?v=sVcwVQRHIc8
Learn how to implement RAG (Retrieval Augmented Generation) from scratch, straight from a LangChain software engineer.
Hive+
[English] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Spark+
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
ClickHouse+
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Hudi+
[English] Spark Quick Start
https://hudi.apache.org/docs/quick-start-guide
We will walk through code snippets that allow you to insert, update, delete, and query a Hudi table.
https://www.oreilly.com/library/view/apache-hudi-the/9781098173821/
Overcome challenges in building transactional guarantees on rapidly changing data by using Apache Hudi.
https://www.youtube.com/watch?v=pyK18sDYnS0
In this video, I'll introduce you to one of the most popular Data Lake solutions out there, Apache Hudi!
Big Data+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Spring+
https://liaoxuefeng.com/books/java/spring/index.html
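Spring is a framework that supports rapid development of Java EE applications. It provides a set of underlying containers and infrastructure and integrates seamlessly with many commonly used open-source frameworks, which makes it all but essential for developing Java EE applications.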
https://spring.io/guides/gs/rest-service
https://spring.io/quickstart
Level up your Java code and explore what Spring can do for you.
AI agent+
https://www.ibm.com/think/ai-agents
Your one-stop resource for gaining in-depth knowledge and hands-on applications of AI agents.
GPT+
https://www.youtube.com/watch?v=kCc8FmEb1nY
We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI's GPT-2 / GPT-3.