阿里巴巴阿里妈妈-引擎研发工程师(java)-北京
社招全职地点:北京状态:招聘
任职要求
1.熟练掌握Java开发和调优、并发编程模式,有丰富Unix/Linux环境开发经验 2.精通大规模高并发互联网应用的设计和开发经验,熟悉常规的分布式架构,熟悉缓存、消息队列等开源中间件 3.有HBase/Hadoop/Spark/Hive/Flink等大数据工具和海量数据加工处理和优化经验优先 4.大学本科及以上学历,有强烈的求知欲和较强的学习能力,对技术充满热情,有搜推广研发经验者优先
工作职责
1.负责阿里妈妈人群定向引擎建设,支持上亿维标签千亿级人群的高性能匹配计算,持续提升在离线引擎链路的效率、稳定性、健壮性 2.负责广告计算平台设计和开发,支持万亿级数据的交互式圈人、洞察、归因、报表场景 3.能够结合业界先进思想进行技术攻坚和突破,驱动业务发展
包括英文材料
C+++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
多线程+
https://liaoxuefeng.com/books/java/threading/basic/index.html
和单线程相比,多线程编程的特点在于:多线程经常需要读写共享数据,并且需要同步。
https://www.youtube.com/watch?v=_uQgGS_VIXM&list=PLsc-VaxfZl4do3Etp_xQ0aQBoC-x5BIgJ
https://www.youtube.com/watch?v=IEEhzQoKtQU
https://www.youtube.com/watch?v=mTGdtC9f4EU&list=PLL8woMHwr36EDxjUoCzboZjedsnhLP1j4
https://www.youtube.com/watch?v=TPVH_coGAQs&list=PLk6CEY9XxSIAeK-EAh3hB4fgNvYkYmghp
https://www.youtube.com/watch?v=xPqnoB2hjjA
This video is an introduction to multithreading in modern C++.
https://www.youtube.com/watch?v=YKBwKy5PrpQ
Rust threading is easy to implement and improves the efficiency of your applications on multi-core systems!
Java+
https://www.youtube.com/watch?v=eIrMbAQSU34
Master Java – a must-have language for software development, Android apps, and more! ☕️ This beginner-friendly course takes you from basics to real coding skills.
Unix+
[英文] The UNIX® Standard
https://www.opengroup.org/membership/forums/platform/unix
https://www.youtube.com/watch?v=IrDUcdpPmdI
UNIX is an operating system which was first developed in the 1970s, and has been under constant development ever since.
Linux+
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
高并发+
https://www.baeldung.com/concurrency-principles-patterns
In this tutorial, we’ll discuss some of the design principles and patterns that have been established over time to build highly concurrent applications.
https://www.baeldung.com/java-concurrency
Handling concurrency in an application can be a tricky process with many potential pitfalls. A solid grasp of the fundamentals will go a long way to help minimize these issues.
https://www.oreilly.com/library/view/concurrency-in-go/9781491941294/
You’ll understand how Go chooses to model concurrency, what issues arise from this model, and how you can compose primitives within this model to solve problems.
https://www.oreilly.com/library/view/modern-concurrency-in/9781098165406/
With this book, you'll explore the transformative world of Java 21's key feature: virtual threads.
https://www.youtube.com/watch?v=qyM8Pi1KiiM
https://www.youtube.com/watch?v=wEsPL50Uiyo
缓存+
https://hackernoon.com/the-system-design-cheat-sheet-cache
The cache is a layer that stores a subset of data, typically the most frequently accessed or essential information, in a location quicker to access than its primary storage location.
https://www.youtube.com/watch?v=bP4BeUjNkXc
Caching strategies, Distributed Caching, Eviction Policies, Write-Through Cache and Least Recently Used (LRU) cache are all important terms when it comes to designing an efficient system with a caching layer.
https://www.youtube.com/watch?v=dGAgxozNWFE
消息队列+
https://www.youtube.com/watch?v=xErwDaOc-Gs
中间件+
https://www.youtube.com/watch?v=1oWPUpMheGk
HBase+
[英文] HBase Tutorial
https://www.tutorialspoint.com/hbase/index.htm
HBase is a data model that is similar to Google's big table designed to provide quick random access to huge amounts of structured data. This tutorial provides an introduction to HBase, the procedures to set up HBase on Hadoop File Systems, and ways to interact with HBase shell.
Hadoop+
https://www.runoob.com/w3cnote/hadoop-tutorial.html
Hadoop 为庞大的计算机集群提供可靠的、可伸缩的应用层计算和存储支持,它允许使用简单的编程模型跨计算机群集分布式处理大型数据集,并且支持在单台计算机到几千台计算机之间进行扩展。
[英文] Hadoop Tutorial
https://www.tutorialspoint.com/hadoop/index.htm
Hadoop is an open-source framework that allows to store and process big data in a distributed environment across clusters of computers using simple programming models.
Spark+
[英文] Learning Spark Book
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf
This new edition has been updated to reflect Apache Spark’s evolution through Spark 2.x and Spark 3.0, including its expanded ecosystem of built-in and external data sources, machine learning, and streaming technologies with which Spark is tightly integrated.
Hive+
[英文] Hive Tutorial
https://www.tutorialspoint.com/hive/index.htm
Hive is a data warehouse infrastructure tool to process structured data in Hadoop. It resides on top of Hadoop to summarize Big Data, and makes querying and analyzing easy.
https://www.youtube.com/watch?v=D4HqQ8-Ja9Y
Flink+
https://nightlies.apache.org/flink/flink-docs-release-2.0/docs/learn-flink/overview/
This training presents an introduction to Apache Flink that includes just enough to get you started writing scalable streaming ETL, analytics, and event-driven applications, while leaving out a lot of (ultimately important) details.
https://www.youtube.com/watch?v=WajYe9iA2Uk&list=PLa7VYi0yPIH2GTo3vRtX8w9tgNTTyYSux
Today’s businesses are increasingly software-defined, and their business processes are being automated. Whether it’s orders and shipments, or downloads and clicks, business events can always be streamed. Flink can be used to manipulate, process, and react to these streaming events as they occur.
大数据+
https://www.youtube.com/watch?v=bAyrObl7TYE
https://www.youtube.com/watch?v=H4bf_uuMC-g
With all this talk of Big Data, we got Rebecca Tickle to explain just what makes data into Big Data.
学历+
相关职位
社招A67994
1、设计和实现高效、实时,可靠的分布式计算引擎,服务于字节跳动公司的数十个产品线的推荐和广告业务; 2、设计与开发分布式计算平台以及机器学习训练作业调度系统和框架,从稳定性、性能和功能等多方面进行优化; 3、构建高效,稳定的集群资源管理系统,在资源隔离,提升资源利用率方面进行优化; 4、深入理解和支撑业务高速发展,抽象出新的架构方案通用解决业务问题。
更新于 2024-08-15
社招3年以上核心本地商业-点
1. 主导与参与搜索、推荐相关业务和算法系统的设计与开发; 2. 通过架构抽象和优化,提升工程、算法、产品的迭代效率; 3. 通过合理的技术选型和实践,优化计算和存储资源效率; 4. 深入理解搜索或推荐业务,与产品及算法合作推动产品探索和前沿算法落地。
更新于 2025-09-24
社招3年以上核心本地商业-业
参与美团核心本地商业的统一搜索引擎建设,具体包含以下方面: 1. 支撑搜推核心业务的统一检索引擎建设,包含核心存储引擎建设(正倒排索引、向量化检索、KV存储等)、检索引 擎建设(SQL化的查询引擎)、索引构建系统、在线召回系统。 2. 负责流批一体的数据处理系统建设,支撑美团全业务线供给、数百数据源、百亿数据的接入处理、实时索引构建和更新。 3. 从离线数据处理、索引构建、在线召回的一站式平台研发,支持业务的一站式迭代、支持大规模存储服务的自动化运维、auto-resharding、弹性伸缩、离在线混布。
更新于 2025-06-22