Tencent WeChat Search - Backend Development Engineer - Data Pipeline Track
Experienced hire | Full-time | 1+ years | WeChat Search (搜一搜) | Engineering | Location: Beijing | Status: Hiring
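Requirements
1. Bachelor's degree or above, at least 2 years of work experience, and a computer science-related major; experience developing search-engine data pipeline architectures is preferred.
2. Proficient in C++, with excellent coding skills and a solid grounding in data structures and fundamental algorithms.
3. Familiar with distributed systems and with the architectures and components used in big data processing; hands-on experience with Spark, Flink, Pulsar, ClickHouse, Iceberg, or similar is preferred, as is experience designing and implementing distributed big data processing systems.
4. Excellent problem analysis and problem-solving skills, strong teamwork and communication skills, an appetite for challenges, and a strong passion for technology.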
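Responsibilities
1. Own the large-scale data ingestion, feature computation, data storage, and publishing platform for multiple content types, including articles (text and images), videos, and accounts.
2. Standardize data engineering practices: drive improvements in data quality, pipeline stability, and platform usability; improve the system's high-concurrency processing performance in large-scale distributed environments; and build up reusable solutions and platform tools to raise data development efficiency.
3. Support the processing needs of various data features in search scenarios, track and adopt the latest industry technologies, and build an industry-leading offline data pipeline architecture.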
Includes English-language materials
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Data structures
https://www.youtube.com/watch?v=8hly31xKli0
In this course you will learn about algorithms and data structures, two of the fundamental topics in computer science.
https://www.youtube.com/watch?v=B31LgI4Y4DQ
Learn about data structures in this comprehensive course. We will be implementing these data structures in C or C++.
https://www.youtube.com/watch?v=CBYHwZcbD-s
Data Structures and Algorithms full course tutorial (Java)
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Distributed systems
https://www.distributedsystemscourse.com/
The home page of a free online class in distributed systems.
https://www.youtube.com/watch?v=7VbL89mKK3M&list=PLOE1GTZ5ouRPbpTnrZ3Wqjamfwn_Q5Y9A
System design
https://roadmap.sh/system-design
Everything you need to know about designing large scale systems.
https://www.youtube.com/watch?v=F2FmTdLtb_4
This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture with clear explanations, real-world examples, and practical strategies.
Spark
[English] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Flink
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
Pulsar
https://pulsar.apache.org/docs/next/functions-develop-tutorial/
Write a function for word count.
https://www.baeldung.com/apache-pulsar
Apache Pulsar is a distributed open source Publication/Subscription based messaging system developed at Yahoo.
https://www.youtube.com/watch?v=TKs5T6N78Tc
Discover the seven key features of Apache Pulsar that make it perfect for providing a centralized messaging & data streaming service for an Enterprise.
ClickHouse
[English] Advanced Tutorial
https://clickhouse.com/docs/tutorial
Learn how to ingest and query data in ClickHouse using the New York City taxi example dataset.
https://www.youtube.com/watch?v=FtoWGT7kS-c
ClickHouse is an open-source column-oriented DBMS for online analytical processing that allows users to generate analytical reports using SQL queries in real-time.
https://www.youtube.com/watch?v=Rhe-kUyrFUE&list=PL0Z2YDlm0b3gcY5R_MUo4fT5bPqUQ66ep
Iceberg
https://iceberg.apache.org/spark-quickstart/
This guide will get you up and running with Apache Iceberg™ using Apache Spark™, including sample code to highlight some powerful features.
https://www.baeldung.com/apache-iceberg-intro
This tutorial will discuss Apache Iceberg, a popular open table format in today’s big data landscape.
https://www.youtube.com/watch?v=TsmhRZElPvM
You’ve probably heard about Apache Iceberg™—after all, it’s been getting a lot of buzz.
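Related positions
Experienced hire | WeChat | Engineering
1. Responsible for developing the recall (retrieval) architecture for the various businesses within WeChat Search.
2. Support feature development and performance optimization across the recall architecture, including but not limited to feature engineering, index building, and online recall, and continuously improve the architecture's performance, stability, iteration efficiency, and overall cost.
Updated 2025-04-07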