RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR
Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube
Course Description
Overview
Explore an approach to multi-channel multi-talker automatic speech recognition (ASR) in this 40-minute conference talk from the Center for Language & Speech Processing (CLSP) at JHU. Learn how convolving overlapping speech signals with room impulse responses (RIRs) yields a new spatial feature called RIR-SF. Discover how this feature outperforms the state-of-the-art 3D spatial feature, achieving a 21.3% relative reduction in Character Error Rate (CER) for multi-channel multi-talker ASR systems. Learn about the robustness of RIR-SF in highly reverberant environments and how it addresses limitations of existing approaches. Gain insights into the theoretical analysis and experimental results presented in support of the new spatial feature.
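For readers new to the terminology, the following is a minimal sketch of two ideas the overview mentions: convolving speech with room impulse responses to obtain a reverberant multi-channel mixture, and computing a relative error reduction. It is not the RIR-SF computation from the talk. The function names (`synthetic_rir`, `reverberant_mixture`, `relative_reduction`), the toy exponentially decaying RIRs, and the random "speech" signals are all assumptions made for illustration.

```python
import numpy as np


def synthetic_rir(length, decay, rng):
    """Toy RIR (assumption): white noise under an exponential decay envelope.

    A measured or simulated room response would be used in practice.
    """
    t = np.arange(length)
    rir = rng.standard_normal(length) * np.exp(-decay * t)
    rir[0] = 1.0  # direct-path tap
    return rir


def reverberant_mixture(sources, rirs):
    """Sum each source convolved with its RIR, separately for every microphone.

    sources: list of 1-D arrays, one per talker.
    rirs:    rirs[s][m] is the 1-D RIR from talker s to microphone m.
    Returns an array of shape (num_mics, out_len).
    """
    num_mics = len(rirs[0])
    out_len = max(
        len(src) + len(rirs[s][m]) - 1
        for s, src in enumerate(sources)
        for m in range(num_mics)
    )
    mix = np.zeros((num_mics, out_len))
    for s, src in enumerate(sources):
        for m in range(num_mics):
            reverberant = np.convolve(src, rirs[s][m])  # full linear convolution
            mix[m, : len(reverberant)] += reverberant
    return mix


def relative_reduction(baseline_error, new_error):
    """Relative reduction of an error rate, as a fraction of the baseline."""
    return (baseline_error - new_error) / baseline_error


# Two talkers, two microphones, random stand-ins for speech (illustrative only).
rng = np.random.default_rng(0)
talkers = [rng.standard_normal(16000), rng.standard_normal(12000)]
room = [[synthetic_rir(2000, 0.002, rng) for _ in range(2)] for _ in talkers]
mixture = reverberant_mixture(talkers, room)
print(mixture.shape)
```

Under this definition, a 21.3% relative reduction means the new CER is the baseline CER multiplied by (1 - 0.213); the absolute CER values are not given in this description.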
Syllabus
RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR
Taught by
Center for Language & Speech Processing (CLSP), JHU
Related Courses
Computational Photography (Georgia Institute of Technology via Udacity)
Discrete Time Signals and Systems, Part 1: Time Domain (Rice University via edX)
Signals and Systems, Part 1 (Indian Institute of Technology Bombay via edX)
Discrete Time Signals and Systems, Part 2: Frequency Domain (Rice University via edX)
Introduction to Sound and Acoustic Sketching (University St. Joseph via Kadenze)