
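Intelligent Connectivity, Qwen Business Unit - Big Data Computing Expert - Hangzhou
Experienced hire | Full-time | 3+ years of experience | Technology - Development | Location: Hangzhou | Status: Hiring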
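Requirements
1. Expert knowledge of distributed systems principles and a deep understanding of the execution model, task scheduling, shuffle mechanism, and fault-tolerance design of compute engines such as Spark, Flink, Ray, and Daft; hands-on experience with large-scale data processing systems.
2. Familiarity with heterogeneous CPU/GPU computing architectures, with real project experience in deep performance tuning in areas such as GPU-accelerated data processing (for example, video decoding and feature extraction), memory management, vectorized execution, or operator fusion.
3. Experience building developer toolchains or programming interfaces (SQL-like / Python SDK), with attention to user experience, the ability to balance system capability against ease of use, and a drive to close the end-to-end development loop from local debugging to production deployment…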
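Responsibilities
1. Own the core architecture design and execution-plan optimization of a large-scale multimodal data compute engine; build efficient task scheduling mechanisms and execution pipelines for heterogeneous computing (CPU/GPU); and resolve key performance bottlenecks in processing PB-scale multimodal data (images, audio, video, and so on).
2. Address the memory, I/O, and storage resource challenges that PB-scale data poses in Shuffle, Join, and Aggregation scenarios, and solve the system-stability problems of unified stream and batch processing at PB-per-day scale.
3. Provide both SQL-like and Python programming interfaces, deliver a one-stop development experience from local development and debugging to production-grade distributed computing, and continuously improve development efficiency and engineering usability.
4. Address the efficient-storage and high-throughput I/O challenges of ingesting hundreds of PB of structured and multimodal data into the data lake; achieve effective isolation of I/O and storage resources across tenants; and build a comprehensive data governance system that safeguards data quality and guards against data corruption.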
Includes English-language materials
Distributed systems
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Ray
https://github.com/ray-project/ray
Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://www.youtube.com/watch?v=FhXfEXUUQp0
In this video, I'll teach you everything you need to know about Apache Ray!
https://www.youtube.com/watch?v=fMiAyj2kgac
Using powerful machine learning algorithms is easy using Ray.io and Python.
https://www.youtube.com/watch?v=q_aTbb7XeL4
Parallel and Distributed computing sounds scary until you try this fantastic Python library.
Performance tuning
https://goperf.dev/
The Go App Optimization Guide is a series of in-depth, technical articles for developers who want to get more performance out of their Go code without relying on guesswork or cargo cult patterns.
https://web.dev/learn/performance
This course is designed for those new to web performance, a vital aspect of the user experience.
https://www.ibm.com/think/insights/application-performance-optimization
Application performance is not just a simple concern for most organizations; it’s a critical factor in their business’s success.
https://www.oreilly.com/library/view/optimizing-java/9781492039259/
Performance tuning is an experimental science, but that doesn’t mean engineers should resort to guesswork and folklore to get the job done.
SQL
https://liaoxuefeng.com/books/sql/introduction/index.html
What is SQL? Put simply, SQL is the standard computer language for accessing and working with relational databases.
https://sqlbolt.com/
Learn SQL with simple, interactive exercises.
https://www.youtube.com/watch?v=p3qvj9hO_Bo
In this video we will cover everything you need to know about SQL in only 60 minutes.
Related positions

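Experienced hire | 3+ years of experience | Technology - Development
1. Design and implement SRE agents based on LLMs and agent frameworks (such as LangGraph, CrewAI, and AutoGPT), building a closed-loop operations system capable of perception, reasoning, planning, execution, and reflection.
2. Break down operations scenarios such as troubleshooting, capacity planning, and performance tuning in depth, and use large models to rebuild their workflows, achieving end-to-end automation from anomaly detection and root-cause analysis to self-healing.
3. Build a specialized knowledge base for the operations domain and optimize the RAG pipeline to improve agent accuracy and expertise on complex domain problems.
4. Explore multi-agent collaboration mechanisms; design and implement task dispatch, role collaboration, and consensus protocols for complex operations tasks.
5. Continuously optimize the performance and scalability of the intelligent operations platform, ensuring that AI decisions remain real-time and stable under high concurrency and at very large compute scale.
Updated 2026-04-06 | Hangzhou, Guangzhou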

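Experienced hire | 3+ years of experience | Technology - Development
1. Help design and implement a high-performance, scalable, distributed big data processing platform that drives model training with data and supports the production and rapid iteration of algorithms for Quark's intelligent speech business.
2. Work closely with algorithm engineers, understand the deep learning model development workflow, and own or contribute to the design, development, and maintenance of data solutions for frontier model research.
3. Use AI capabilities to strengthen data infrastructure, continuously improving the platform's data production efficiency and usability and lowering the cost of use for algorithm teams.
Updated 2026-04-06 | Beijing, Hangzhou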

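Experienced hire | 5+ years of experience | Technology - Development
1. Lead engineering development and team management for speech-focused AI agent applications, ensuring efficient system iteration and high-quality product delivery.
2. Work with product and algorithm teams to advance the technical evolution and business adoption of speech agent applications.
3. Take part in technical discussions with customers, manage their technical expectations, and improve customer satisfaction and project delivery outcomes.
4. Define the medium- and long-term evolution plan for the speech application architecture, and continuously optimize the end-to-end speech experience and system stability.
Updated 2026-04-06 | Beijing, Hangzhou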