Multimodal Machine Learning for Human-Centric Tasks
Offered By: Center for Language & Speech Processing (CLSP), JHU via YouTube
Course Description
Overview
Explore cutting-edge multimodal architectures for human-centric tasks in this 48-minute seminar from the Center for Language & Speech Processing at Johns Hopkins University. Delve into novel approaches for handling missing modalities, identifying noisy features, and leveraging unlabeled data in multimodal systems. Learn about strategies combining auxiliary networks, transformer architectures, and optimized training mechanisms for robust audiovisual emotion recognition. Discover a deep learning framework with gating layers to improve audiovisual automatic speech recognition by mitigating the impact of noisy visual features. Examine methods to utilize unlabeled multimodal data, including carefully designed pretext tasks and multimodal ladder networks with lateral connections. Gain insights into enhancing generalization and robustness in speech-processing tasks using these advanced multimodal architectures.
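The gating-layer idea mentioned above can be illustrated with a minimal sketch: a per-dimension gate, computed from both modalities, scales down the visual features before fusion so that noisy visual input contributes less to the audiovisual representation. This is only an illustrative toy (the function name `gated_fusion` and the random weights are assumptions, not the seminar's actual model).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(audio, visual, W_g, b_g):
    """Toy gated fusion: compute a gate in (0, 1) from both modalities,
    scale the visual features by it, then concatenate with the audio."""
    gate = sigmoid(np.concatenate([audio, visual]) @ W_g + b_g)
    return np.concatenate([audio, gate * visual])

# Hypothetical feature vectors and randomly initialized gate weights.
rng = np.random.default_rng(0)
audio = rng.standard_normal(8)
visual = rng.standard_normal(8)
W_g = rng.standard_normal((16, 8)) * 0.1
b_g = np.zeros(8)

fused = gated_fusion(audio, visual, W_g, b_g)
print(fused.shape)  # -> (16,)
```

In a trained system the gate weights would be learned so that the gate moves toward zero when the visual stream is unreliable, leaving the audio features to dominate.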
Syllabus
Multimodal Machine Learning for Human-Centric Tasks
Taught by
Center for Language & Speech Processing (CLSP), JHU
Related Courses
Linear Circuits — Georgia Institute of Technology via Coursera
Introduction to Power and Energy Engineering (مقدمة في هندسة الطاقة والقوى) — King Abdulaziz University via Rwaq (رواق)
Magnetic Materials and Devices — Massachusetts Institute of Technology via edX
Linear Circuits 2: AC Analysis — Georgia Institute of Technology via Coursera
Electric Power Transmission (Transmisión de energía eléctrica) — Tecnológico de Monterrey via edX