Didi Senior Speech Algorithm Engineer (J250529004)
Experienced hire | Full-time | 3-5 years | Technology | Location: Beijing | Status: Hiring
Qualifications
1. Bachelor’s or higher degree in electronics, computer science, acoustics, signal processing, or a related field, with a solid foundation in speech signal processing.
2. Proficient in the PyTorch framework, with strong programming skills, fluency in Python, and development experience on Linux.
3. 3-5 years of hands-on experience with algorithms such as speech recognition, audio event detection, speaker recognition, and speech synthesis.
4. Publications at top-tier speech conferences (e.g., ICASSP, Interspeech, ASRU) or outstanding results in international competitions are preferred.
5. Excellent ACM competition results or strong C++ programming skills are preferred.
6. Experience with large-scale speech recognition or speech synthesis projects is preferred.
Job Description
1. Deploy speech understanding and speech generation algorithms in Didi’s business scenarios.
2. Follow the latest technologies and, in the context of the business scenarios, improve the performance of algorithms such as speech recognition, audio event detection, speaker recognition, and speech synthesis.
3. Explore application paradigms for large speech models or multimodal large models in speech understanding and speech generation scenarios.
4. Optimize algorithms: improve model inference speed and reduce inference cost through work on model architecture, inference frameworks, and quantization and compression.
Includes English-language materials
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
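PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.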
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with full examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
Linux
https://ryanstutorials.net/linuxtutorial/
Ok, so you want to learn how to use the Bash command line interface (terminal) on Unix/Linux.
https://ubuntu.com/tutorials/command-line-for-beginners
The Linux command line is a text interface to your computer.
https://www.youtube.com/watch?v=6WatcfENsOU
In this Linux crash course, you will learn the fundamental skills and tools you need to become a proficient Linux system administrator.
https://www.youtube.com/watch?v=v392lEyM29A
Never fear the command line again, make it fear you.
https://www.youtube.com/watch?v=ZtqBQ68cfJc
Speech Recognition
https://www.youtube.com/watch?v=mYUyaKmvu6Y
Learn how to implement speech recognition in Python by building five projects.
https://www.youtube.com/watch?v=sR6_bZ6VkAg
How Rev.com harnesses human-in-the-loop and deep learning to build the world's best English speech recognition engine
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
Development Frameworks
[English] Understanding Modern Development Frameworks: A Guide for Developers and Technical Decision-makers
https://www.freecodecamp.org/news/understanding-modern-development-frameworks-guide-for-devs/
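Related Positions
Experienced hire | 3-5 years | Technology
1. Deploy speech understanding and speech generation algorithms in Didi's business scenarios. 2. Follow the latest technologies and, in the context of the business scenarios, improve the performance of algorithms such as speech recognition, audio event detection, speaker recognition, and speech synthesis. 3. Explore application paradigms for large speech models or multimodal large models in speech understanding and speech generation scenarios.
Updated 2025-10-09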
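Experienced hire | 3+ years | Technology | Team: AI &
1) Implement and apply in depth NLP and related algorithms for text and voice intelligent customer service; technical areas include semantic matching, multi-turn dialogue, intelligent question answering, information extraction, speech algorithms, and recommendation algorithms. 2) Drive the full intelligent transformation of the customer service system, improving its operating efficiency and stability. 3) Take part in research on new AI technologies and in delivering application solutions, including large models and virtual humans.
Updated 2025-05-28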