
RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube

Tags

Convolution Courses

Course Description

Overview

Explore a new approach to multi-channel multi-talker automatic speech recognition (ASR) in this 40-minute conference talk from the Center for Language & Speech Processing (CLSP) at JHU. The talk presents RIR-SF, a spatial feature created by convolving overlapping speech signals with room impulse responses (RIRs). Discover how this method outperforms the state-of-the-art 3D spatial feature, achieving a 21.3% relative reduction in Character Error Rate (CER) for multi-channel multi-talker ASR systems. Learn how RIR-SF remains robust in highly reverberant environments, where existing approaches are limited. Gain insights into the theoretical analysis and experimental results behind these findings.
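For readers new to the underlying operation, the following is a minimal sketch of convolving speech signals with room impulse responses. It is not the RIR-SF method from the talk; it only shows the basic convolution step the description refers to. The names `synthetic_rir` and `reverberate`, the noise signals standing in for talkers, and the exponentially decaying toy RIR are all assumptions made for illustration.

```python
import numpy as np


def synthetic_rir(length, decay, rng):
    """Toy RIR (assumption): white noise shaped by an exponential decay.

    A real RIR would be measured in a room or produced by a room simulator.
    """
    t = np.arange(length)
    rir = rng.standard_normal(length) * np.exp(-decay * t)
    return rir / np.max(np.abs(rir))  # normalize peak magnitude to 1


def reverberate(dry, rir):
    """Convolve a dry signal with an RIR.

    The output has len(dry) + len(rir) - 1 samples.
    """
    return np.convolve(dry, rir, mode="full")


rng = np.random.default_rng(0)
fs = 16000  # assumed sample rate in Hz

# Noise bursts stand in for two overlapping talkers (hypothetical data).
talker_a = rng.standard_normal(fs)
talker_b = rng.standard_normal(fs)

# Each talker reaches the microphone through a different toy RIR.
rir_a = synthetic_rir(4000, 0.002, rng)
rir_b = synthetic_rir(4000, 0.002, rng)

# The single-microphone mixture is the sum of the reverberated talkers.
mixture = reverberate(talker_a, rir_a) + reverberate(talker_b, rir_b)
print(mixture.shape)  # (19999,)
```

In a multi-channel setting, the same step would be repeated with one RIR per talker and microphone pair.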

Syllabus

RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR


Taught by

Center for Language & Speech Processing (CLSP), JHU

Related Courses

Computational Photography
Georgia Institute of Technology via Udacity
Discrete Time Signals and Systems, Part 1: Time Domain
Rice University via edX
Signals and Systems, Part 1
Indian Institute of Technology Bombay via edX
Discrete Time Signals and Systems, Part 2: Frequency Domain
Rice University via edX
Introduction to Sound and Acoustic Sketching
University St. Joseph via Kadenze