YoVDO

Multimodal Machine Learning for Human-Centric Tasks

Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube

Tags

Deep Learning Courses
Computer Vision Courses
Transformers Courses
Feature Selection Courses

Course Description

Overview

Explore cutting-edge multimodal architectures for human-centric tasks in this 48-minute seminar from the Center for Language & Speech Processing at Johns Hopkins University. Delve into novel approaches for handling missing modalities, identifying noisy features, and leveraging unlabeled data in multimodal systems. Learn about strategies combining auxiliary networks, transformer architectures, and optimized training mechanisms for robust audiovisual emotion recognition. Discover a deep learning framework with gating layers to improve audiovisual automatic speech recognition by mitigating the impact of noisy visual features. Examine methods to utilize unlabeled multimodal data, including carefully designed pretext tasks and multimodal ladder networks with lateral connections. Gain insights into enhancing generalization and robustness in speech-processing tasks using these advanced multimodal architectures.
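The gating idea mentioned above for audiovisual ASR can be sketched in a few lines: a gate conditioned on both modalities scales the visual features, so noisy visual input can be suppressed toward zero before fusion. This is a minimal illustration only, not the speaker's actual model; all dimensions, weights, and function names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical per-frame feature dimensions for audio and visual streams.
d_a, d_v = 16, 16

# Illustrative (untrained) gate parameters; in practice these are learned.
W_g = rng.standard_normal((d_a + d_v, d_v)) * 0.1
b_g = np.zeros(d_v)

def gated_fusion(audio_feat, visual_feat):
    """Scale visual features by a gate in (0, 1) before concatenating
    with the audio features. The gate sees both modalities, so it can
    down-weight the visual stream when it appears unreliable."""
    g = sigmoid(np.concatenate([audio_feat, visual_feat]) @ W_g + b_g)
    return np.concatenate([audio_feat, g * visual_feat])

a = rng.standard_normal(d_a)
v = rng.standard_normal(d_v)
fused = gated_fusion(a, v)
print(fused.shape)  # (32,)
```

With a learned `W_g`, the gate approaches zero on frames where the visual features carry mostly noise, so the fused representation degrades gracefully toward audio-only recognition.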

Syllabus

Multimodal Machine Learning for Human-Centric Tasks


Taught by

Center for Language & Speech Processing (CLSP), JHU

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
Machine Learning Techniques (機器學習技法)
National Taiwan University via Coursera
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Applied Problems in Data Analysis (Прикладные задачи анализа данных)
Moscow Institute of Physics and Technology via Coursera
Leading Ambitious Teaching and Learning
Microsoft via edX