Leveraging Pre-training Models for Speech Processing
Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube
Course Description
Overview
Explore the cutting-edge advancements in speech processing through pre-training models in this comprehensive 2-hour and 59-minute lecture from the Center for Language & Speech Processing (CLSP) at Johns Hopkins University. Delve into the crucial role of pre-training in advancing speech, natural language processing (NLP), and computer vision (CV) research. Discover how pre-trained models, leveraging vast amounts of unlabeled data, can be transferred to multiple downstream applications with remarkable efficiency. Examine benchmarks like SUPERB, LeBenchmark, and NOSS, which evaluate pre-trained models' generalizability across various speech and audio processing tasks. Learn how small players in the field can benefit from publicly available pre-trained models, achieving impressive performance with minimal additional training. Investigate research directions including modeling prosody, efficient leveraging of pre-trained models in downstream tasks, robustness under domain mismatch, incorporation of visual information, and development of computationally efficient models. Gain insights into open-source benchmarks, tools, and techniques that drive the frontier of network pre-training technology in speech processing.
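The transfer pattern described above (pre-train once on unlabeled data, then adapt cheaply to many downstream tasks) can be sketched in a few lines. The encoder below is a hypothetical stand-in for a real frozen pre-trained speech model such as wav2vec 2.0; in practice you would load a public checkpoint, freeze its weights, and train only a small task head on its features, which is exactly why small players can achieve strong performance with minimal additional training.

```python
import math

def pretrained_encoder(waveform):
    """Stand-in for a frozen pre-trained model: maps raw samples to a
    small fixed feature vector (mean amplitude, mean absolute energy).
    A real system would use wav2vec 2.0 / HuBERT-style representations."""
    n = len(waveform)
    mean = sum(waveform) / n
    energy = sum(abs(x) for x in waveform) / n
    return [mean, energy]

def train_downstream_head(examples, epochs=200, lr=0.1):
    """Train a tiny logistic-regression head on frozen features.
    Only these few weights are updated; the encoder never changes,
    which is what makes transfer from a pre-trained model so cheap."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for waveform, label in examples:
            f = pretrained_encoder(waveform)   # frozen: no updates here
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - label                      # gradient of the log-loss
            w[0] -= lr * g * f[0]
            w[1] -= lr * g * f[1]
            b -= lr * g
    return w, b

def predict(waveform, w, b):
    f = pretrained_encoder(waveform)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Toy downstream task: classify "loud" (1) vs "quiet" (0) clips.
data = [([0.9, -0.8, 0.7, -0.9], 1),
        ([0.05, -0.04, 0.03, -0.05], 0)]
w, b = train_downstream_head(data)
print(predict([0.8, -0.7, 0.9, -0.8], w, b))    # loud clip -> 1
print(predict([0.02, -0.03, 0.04, -0.02], w, b))  # quiet clip -> 0
```

This is only an illustration of the workflow the lecture covers; benchmarks such as SUPERB formalize it by fixing the frozen upstream model and measuring how well lightweight downstream heads perform across many speech tasks.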
Syllabus
Leveraging Pre-training Models for Speech Processing
Taught by
Center for Language & Speech Processing (CLSP), JHU
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Computational Photography
Georgia Institute of Technology via Coursera
Einführung in Computer Vision (Introduction to Computer Vision)
Technische Universität München (Technical University of Munich) via Coursera
Introduction to Computer Vision
Georgia Institute of Technology via Udacity