RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube

Course Description

Overview


This 40-minute conference talk from the Center for Language & Speech Processing at JHU presents an approach to multi-channel multi-talker automatic speech recognition (ASR). The technique convolves overlapping speech signals with room impulse responses (RIRs) to create a novel spatial feature called RIR-SF. According to the talk, this method outperforms the state-of-the-art 3D spatial feature, achieving a 21.3% relative reduction in Character Error Rate (CER) for multi-channel multi-talker ASR systems. The talk also covers the robustness of RIR-SF in highly reverberant environments, its potential to overcome limitations of existing approaches, and the theoretical analysis and experimental results behind these claims.
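A relative reduction is easy to confuse with an absolute one, so a small worked example may help when reading the 21.3% figure. The sketch below assumes a hypothetical baseline CER of 10.0%; that number and the function name `relative_reduction` are illustrative only and do not come from the talk, which reports only the relative figure here.

```python
def relative_reduction(baseline_cer: float, new_cer: float) -> float:
    """Return the reduction of new_cer relative to baseline_cer, as a fraction."""
    if baseline_cer <= 0:
        raise ValueError("baseline_cer must be positive")
    return (baseline_cer - new_cer) / baseline_cer


# Hypothetical baseline CER in percent (an assumption, not a result from the talk).
baseline = 10.0

# CER that would correspond to a 21.3% relative reduction from that baseline.
improved = baseline * (1 - 0.213)

# Absolute drop in percentage points versus the relative reduction.
absolute_drop = baseline - improved
rel = relative_reduction(baseline, improved)

print(f"improved CER: {improved:.2f}%")
print(f"absolute drop: {absolute_drop:.2f} points")
print(f"relative reduction: {rel:.1%}")
```

Under this assumed baseline, the absolute improvement is only a few percentage points even though the relative reduction is 21.3%; with a different baseline the absolute drop would scale accordingly.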

Syllabus

RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR

Taught by

Center for Language & Speech Processing (CLSP), JHU
