Target-Speaker Methods for Speech Recognition - Overlapping Speech Solutions
Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube
Course Description
Overview
Explore techniques for handling overlapping speech in multi-talker Automatic Speech Recognition (ASR) in this 52-minute talk by Desh Raj of the Center for Language & Speech Processing at Johns Hopkins University. The talk covers "target-speaker" methods, starting with a traditional signal processing approach and a new GPU-accelerated implementation of it that dramatically speeds up meeting transcription. It then presents a project that uses wake-words for on-device target-speaker ASR, yielding significant Word Error Rate (WER) reductions, and shows how self-supervised models can be incorporated into the same paradigm. The talk also addresses the challenges of building effective ASR systems for complex audio environments such as meeting transcription and smart assistants in noisy settings.
Syllabus
Target-speaker Methods for Speech Recognition – Desh Raj
Taught by
Center for Language & Speech Processing (CLSP), JHU
Related Courses
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Elaborazione del linguaggio naturale
University of Naples Federico II via Federica
Deep Learning for Natural Language Processing
University of Oxford via Independent
Deep Learning Summer School
Independent
Sequence Models
DeepLearning.AI via Coursera