
Soul APP Speech Algorithm Engineer
Experienced hire · Full-time · Location: Beijing · Status: Hiring
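Qualifications
Job requirements
1. Master's degree or above in computer science, signal processing, mathematics, statistics, or a related field.
2. Familiarity with common speech synthesis frameworks such as FastSpeech, Tacotron, VITS, HiFi-GAN, and VALL-E, with development and application experience in one or more of them.
3. Familiarity with the principles of Transformers, LLMs, and diffusion models, plus some understanding of or hands-on experience with hyper-natural speech generation and multimodal speech generation.
4. Proficiency with the PyTorch or TensorFlow deep learning frameworks, solid programming skills in Python/C/C++, and extensive practical experience.
5. Strong self-learning ability and independent thinking, skill at reasoning through and articulating your own ideas, and good team spirit.
Bonus points
1. Publications at relevant international conferences or in mainstream journals are preferred.
2. Mastery of multimodal techniques is preferred.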
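Responsibilities
Job duties
1. Develop data and models for speech generation technologies such as speech synthesis, voice cloning, and duplex voice calls, and help bring them into production.
2. Continuously track cutting-edge algorithm developments in the industry and support the growth of the company's influence in core technologies.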
Includes English-language materials
Transformer
https://huggingface.co/learn/llm-course/en/chapter1/4
Breaking down how Large Language Models work, visualizing how data flows through.
https://poloclub.github.io/transformer-explainer/
An interactive visualization tool showing you how transformer models work in large language models (LLM) like GPT.
https://www.youtube.com/watch?v=wjZofJX0v4M
Breaking down how Large Language Models work, visualizing how data flows through.
Large language models
https://www.youtube.com/watch?v=xZDB1naRUlk
You will build projects with LLMs that will enable you to create dynamic interfaces, interact with vast amounts of text data, and even empower LLMs with the capability to browse the internet for research papers.
https://www.youtube.com/watch?v=zjkBMFhNj_g
PyTorch
https://datawhalechina.github.io/thorough-pytorch/
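PyTorch is an important tool for data science research with deep learning. It offers considerable advantages in flexibility, readability, and performance, and in recent years it has become the framework most commonly used in academia to implement deep learning algorithms.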
https://www.youtube.com/watch?v=V_xro1bcAuA
Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.
TensorFlow
https://www.youtube.com/watch?v=tpCFfeUEGs8
Ready to learn the fundamentals of TensorFlow and deep learning with Python? Well, you’ve come to the right place.
https://www.youtube.com/watch?v=ZUKz4125WNI
This part continues right where part one left off so get that Google Colab window open and get ready to write plenty more TensorFlow code.
Deep learning
https://d2l.ai/
Interactive deep learning book with code, math, and discussions.
Python
https://liaoxuefeng.com/books/python/introduction/index.html
In Chinese, free, for complete beginners, with full examples, based on the latest Python 3 release.
https://www.learnpython.org/
A free interactive Python tutorial for people who want to learn Python, fast.
https://www.youtube.com/watch?v=K5KVEU3aaeQ
Master Python from scratch 🚀 No fluff—just clear, practical coding skills to kickstart your journey!
https://www.youtube.com/watch?v=rfscVS0vtbw
This course will give you a full introduction to all of the core concepts in Python.
C
https://www.freecodecamp.org/chinese/news/the-c-beginners-handbook/
This handbook follows the 80/20 rule: you will learn 80% of the C programming language in 20% of the time.
https://www.youtube.com/watch?v=87SH2Cn0s9A
https://www.youtube.com/watch?v=KJgsSFOSQv0
This course will give you a full introduction into all of the core concepts in the C programming language.
https://www.youtube.com/watch?v=PaPN51Mm5qQ
In this complete C programming course, Dr. Charles Severance (aka Dr. Chuck) will help you understand computer architecture and low-level programming with the help of the classic C Programming language book written by Brian Kernighan and Dennis Ritchie.
C++
https://www.learncpp.com/
LearnCpp.com is a free website devoted to teaching you how to program in modern C++.
https://www.youtube.com/watch?v=ZzaPdXTrSb8
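Related positions
Campus recruitment · R&D
1. Help build speech algorithm capabilities, including but not limited to speech recognition, acoustic models, language models, hotword techniques, speech synthesis, and audio forgery detection.
2. Work on compression and quantization, inference acceleration, and compact deployment of speech algorithms.
3. Track frontier technology planning in the speech algorithm field and take part in bringing core algorithms and system solutions into production.
Updated 2025-08-08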
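Experienced hire · 3+ years
1. Research and develop audio signal processing algorithms, using traditional or AI-based approaches, such as AEC (acoustic echo cancellation), speech noise reduction, and array algorithms.
2. Validate, optimize, and ship these algorithms in specific business scenarios and projects, deploying them on on-device platforms.
3. Follow industry technology trends and developments, research emerging technologies, and keep the team technically ahead.
Updated 2024-10-08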

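Campus recruitment · AI algorithms
1. Research algorithms and modeling approaches for speech recognition and understanding and for speech generation, and drive their application.
2. Focus on generative algorithms such as hyper-realistic speech synthesis, emotion-controllable speech synthesis, zero-resource voice cloning (zero-shot TTS), voice conversion, and audio and music generation, as well as multitask models that combine speech recognition, speech translation, speaker recognition, audio analysis, and speech separation; drive their use in scenarios such as simultaneous interpretation and digital-human dialogue.
3. Explore combining the audio modality with LLMs to build a unified modeling approach for speech recognition, audio understanding, speech generation, voice conversion, music generation, and sound-effect generation, and drive it into production.
4. Keep the algorithms at the forefront of the industry through tracking and innovation.
5. Follow the latest research in academia and industry, attend international conferences and workshops, and exchange and collaborate with top teams worldwide.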