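Quark Intelligent Information - Quark Search Offline Development Expert - Hangzhou
Experienced hire | Full-time | 1+ years of experience | Technology - Development | Location: Hangzhou | Status: Hiring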
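Requirements
1. Strong learning ability and willingness to learn; good communication skills, a strong sense of responsibility, and team spirit.
2. Strong programming fundamentals. Familiar with C++/Python/Java, with a solid foundation in data structures, basic algorithms, networking, and operating systems.
3. Extensive experience in distributed systems development and performance optimization, including processing large-scale data at the ten-billion scale or above.
Bonus points:
1. Experience optimizing search engine quality is preferred, including ranking, recall, and offline content understanding.
2. Development experience on distributed computing/storage platforms such as Hadoop, Spark, Flink, and HBase is preferred.
3. Experience with large-scale crawlers is preferred, including crawl rate control, proxies, JS rendering, and anti-blocking.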
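Responsibilities
1. Participate in the R&D of offline systems and strategies for the search business.
2. Participate in the collection, analysis, and storage of large-scale web page data, and in building the related platform.
3. Process and mine Quark Search's massive web page data.
4. Participate in optimizing the performance and stability of offline systems.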
Includes English-language materials
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, starts from zero, with complete examples, based on the latest Python 3 version.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Java
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial in Java
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Distributed systems
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
Hadoop
https://www.runoob.com/w3cnote/hadoop-tutorial.html
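Hadoop provides reliable, scalable application-layer computing and storage support for very large computer clusters. It allows large data sets to be processed in a distributed manner across clusters of computers using simple programming models, and it scales from a single machine to several thousand machines.
[English] Hadoop Tutorial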
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework for storing and processing big data in a distributed environment across clusters of computers using simple programming models.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
HBase
[English] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model similar to Google's Bigtable, designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with the HBase shell.
JavaScript
https://developer.mozilla.org/zh-CN/docs/Learn_web_development/Core/Scripting
[English] Learn JavaScript
https://learnjavascript.online/
The easiest way to learn & practice modern JavaScript
[English] Learn JavaScript
https://web.dev/learn/javascript
https://www.youtube.com/watch?v=zuKbR4Q428o
Write bulletproof JavaScript code with unit testing!
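Related positions
Experienced hire | 3+ years of experience | Technology - Development
1. Own R&D for the Alipay app's search product: understand the product in depth, engage with the business, and iterate on and optimize the product; own the end-to-end online/offline architecture design for search and recommendation to support rapid business growth.
2. Lead pre-research on key technologies and tackle hard technical problems in search and recommendation; drive platform building, improve iteration efficiency, and build highly reliable, high-performance, highly scalable systems.
3. Follow frontier developments in AI-related systems, including but not limited to generative AI, machine learning, vector retrieval, indexing, ranking, graph databases, and big-data technologies; proactively explore new AI search system architectures and drive them into production.
Updated 2025-09-28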
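Experienced hire | 3+ years of experience | Technology - Development
1. Proactively explore new AI search system architectures for AI Native applications, build highly reliable, high-performance, highly scalable systems, and drive them into production.
2. Own R&D of the AI search architecture, including generative search and multimodal search; advance key search technologies based on ten-billion-scale data and large-model technology.
3. Own the collection, understanding, database building, and index architecture design for web-wide index data such as web pages, images, videos, and documents; build a timely, high-quality, highly available index data architecture.
4. Drive search platform building for multi-scenario applications and improve iteration efficiency.
Updated 2025-07-09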
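Experienced hire | 3+ years of experience | Technology - Data
1. Primarily participate in data development for businesses such as search and recommendation, user growth, and retail.
2. Participate in governance of real-time and offline data pipelines; improve business efficiency through data governance and quality optimization.
3. Based on an abstraction of business understanding and product requirements, participate in the architecture design and implementation of a unified stream-batch data lakehouse oriented to business applications.
4. Develop a deep understanding of the e-commerce platform's business; through process data analysis, continually locate and uncover potential problems to support business growth.
Updated 2025-08-27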