Generative AI for Text to Audio Generation
Offered By: SAIConference via YouTube
Course Description
Overview
Explore cutting-edge advancements in generative AI for text-to-audio generation in this keynote presentation by Professor Wenwu Wang from the University of Surrey. Delve into the evolution of AI technology capable of producing soundscapes from simple text prompts, with applications across filmmaking, game design, virtual reality, and digital media. Learn about the progression from traditional methods to deep learning-based models such as AudioLDM, AudioLDM 2, Re-AudioLDM, and WavJourney, and understand how these models map and align text with audio events to create complex audio environments. Discover real-world applications ranging from sound synthesis in gaming and movies to assisting the visually impaired. Gain insights into recent breakthroughs in cross-modal generation, key challenges, and future research directions. Experience live demonstrations and learn how to experiment with these tools on platforms like GitHub and Hugging Face. Key topics covered include an overview of deep generative AI for text-to-audio generation, an introduction to key models, practical applications in sound design, and hands-on experimentation with open-source tools.
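For readers who want to try text-to-audio generation before watching the talk, the sketch below shows one way to run an AudioLDM checkpoint through the Hugging Face diffusers library. The model id, prompt, and generation parameters are illustrative assumptions, not taken from the presentation.

```python
# Minimal sketch: text-to-audio generation with AudioLDM via Hugging Face diffusers.
# Assumes `diffusers`, `torch`, and `scipy` are installed and a CUDA GPU is available.
import torch
import scipy.io.wavfile
from diffusers import AudioLDMPipeline

# Load a pretrained AudioLDM checkpoint from the Hugging Face Hub
# (checkpoint name is an example; other AudioLDM variants can be substituted).
pipe = AudioLDMPipeline.from_pretrained(
    "cvssp/audioldm-s-full-v2", torch_dtype=torch.float16
).to("cuda")

# Describe the desired soundscape as a text prompt.
prompt = "Birds singing in a forest with a gentle stream in the background"
audio = pipe(prompt, num_inference_steps=50, audio_length_in_s=5.0).audios[0]

# AudioLDM produces 16 kHz mono audio; save it as a WAV file.
scipy.io.wavfile.write("generated_soundscape.wav", rate=16000, data=audio)
```

Longer clips, different checkpoints, or the AudioLDM 2 pipeline can be swapped in with the same basic workflow.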
Syllabus
Generative AI for Text to Audio Generation | Wenwu Wang, University of Surrey | IntelliSys 2024
Taught by
SAIConference
Related Courses
Neural Networks for Machine Learning
University of Toronto via Coursera
機器學習技法 (Machine Learning Techniques)
National Taiwan University via Coursera
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Прикладные задачи анализа данных (Applied Problems in Data Analysis)
Moscow Institute of Physics and Technology via Coursera
Leading Ambitious Teaching and Learning
Microsoft via edX