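Youku - Speech Algorithm Expert - Hangzhou/Beijing
Experienced hire | Full-time | 3+ years of experience | Location: Beijing / Hangzhou | Status: Hiring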
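Requirements
1. Relevant project experience in speech synthesis, music generation, text-to-audio, video-to-audio, or similar areas, with the ability to carry out in-depth algorithm R&D and to innovate in at least one of these fields;
2. Familiarity with classical machine learning theory and with open-source deep learning frameworks; a deep understanding of the principles behind models such as CNN/RNN/VAE/GAN/Transformer/Diffusion; proficiency in at least…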
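Responsibilities
Apply audio technologies such as voiceprint recognition, audio understanding, and audio enhancement to process massive volumes of audio data; keep pace with cutting-edge industry directions; and help build a generative audio system that takes multimodal inputs (text, video, audio) and produces speech, music, and sound-effect generation models with high naturalness, diverse styles, and strong controllability. We look forward to outstanding people who pursue excellence and are self-driven, smart, and optimistic joining Hujing Digital Media & Entertainment Group (虎鲸文娱集团) to open up a new commercial landscape for the industrialization of film and television.
Specific responsibilities include, but are not limited to:
1. Working with massive data and complex business scenarios, build a first-class audio synthesis system for real film and TV drama scenarios together with other team members, and jointly drive the productization and commercialization of the technology;
2. Design audio synthesis algorithms covering one or more of the following directions:
• Highly human-like, emotionally rich speech synthesis, including the optimization and deployment of modules such as Emotional TTS, Speaker Recognition, and Instant Voice Clone;
• Cross-modal audio generation, including model optimization and deployment in directions such as Video-to-Audio and Text-to-Audio;
• End-to-end music generation, including the optimization and deployment of modules such as Lyric-to-Song, CoT, and ICL;
3. Track cutting-edge industry technologies and methods, keep exploring new capabilities and applications of audio synthesis, solve practical problems in resource-constrained scenarios, and continuously improve core audio synthesis capabilities.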
Includes English-language materials
Algorithms
https://roadmap.sh/datastructures-and-algorithms
Step by step guide to learn Data Structures and Algorithms in 2025
https://www.hellointerview.com/learn/code
A visual guide to the most important patterns and approaches for the coding interview.
https://www.w3schools.com/dsa/
Machine Learning
https://www.youtube.com/watch?v=0oyDqO8PjIg
Learn about machine learning and AI with this comprehensive 11-hour course from @LunarTech_ai.
https://www.youtube.com/watch?v=i_LwzRVP7bg
Learn Machine Learning in a way that is accessible to absolute beginners.
https://www.youtube.com/watch?v=NWONeJKn6kc
Learn the theory and practical application of machine learning concepts in this comprehensive course for beginners.
https://www.youtube.com/watch?v=PcbuKRNtCUc
Learn about all the most important concepts and terms related to machine learning and AI.
Deep Learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
LSTM
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
Humans don’t start their thinking from scratch every second.
https://d2l.ai/chapter_recurrent-modern/lstm.html
The term “long short-term memory” comes from the following intuition.
https://developer.nvidia.com/discover/lstm
A Long short-term memory (LSTM) is a type of Recurrent Neural Network specially designed to prevent the neural network output for a given input from either decaying or exploding as it cycles through the feedback loops.
https://www.youtube.com/watch?v=YCzL96nL7j0
Basic recurrent neural networks are great, because they can handle different amounts of sequential data, but even relatively small sequences of data can make them difficult to train.
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
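PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.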
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
Related positions
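Experienced hire | 1+ years of experience | Technical - Algorithms
1. Iterate on and improve the main modules, including the acoustic front end, acoustic model, language model, post-processing, and decoder;
2. Optimize the full-duplex interaction system to improve its robustness and performance;
3. Optimize large speech recognition models, streaming speech recognition, large audio understanding models, end-to-end large speech models, and related systems;
4. Track cutting-edge speech technology in the industry and explore applications of large speech models in business scenarios.
Updated 2025-10-28 | Beijing / Hangzhou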
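Experienced hire | 2+ years of experience | Technical - Algorithms
1. Conduct fundamental research on algorithms for speech synthesis, speech recognition, and end-to-end large speech interaction models, and bring them into production;
2. Help deploy speech synthesis and recognition technology in business scenarios, solve the frontier problems that arise during deployment, and continuously improve the quality of the core speech synthesis and recognition technology;
3. Investigate and follow cutting-edge work in audio, NLP, multimodal, and omni-modal directions, and keep exploring new capabilities and applications of speech technology.
Updated 2025-09-19 | Beijing / Hangzhou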
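Experienced hire | 3+ years of experience | Technical - Algorithms
1. R&D on text analysis for speech synthesis, prosody prediction, phonetic annotation, and related techniques;
2. Familiarity with common acoustic models and vocoders, with related development and research experience;
3. Familiarity with voice conversion algorithms and techniques;
4. Familiarity with building and optimizing general-purpose synthesis engines, with experience optimizing both cloud and on-device engines;
5. Investigate and follow cutting-edge work in audio, NLP, and multimodal directions, and keep exploring new capabilities and applications of speech synthesis.
Updated 2025-10-28 | Beijing / Hangzhou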