Creating Voiceovers with OpenAI's Text-to-Speech and Vision Models

Offered By: Ian Wootten via YouTube

Course Description

Overview

Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!

Explore the latest advancements in OpenAI's text-to-speech (TTS) and GPT-4V models in this 15-minute video tutorial. Discover innovative applications of these technologies, including generating image descriptions and creating audio content. Learn how to produce voiceovers for images and videos using a combination of TTS and GPT-4V. Follow along as the presenter demonstrates practical examples and showcases novel ways developers have been utilizing these powerful tools. Gain insights into the potential of AI-driven content creation and enhance your understanding of cutting-edge language and vision models.

Syllabus

Intro
Using TTS to create audio
Using GPT4V to describe images
Using TTS & GPT4V for Video voiceovers
Conclusion

Taught by

Ian Wootten

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity Computer Vision: The Fundamentals
University of California, Berkeley via Coursera Computational Photography
Georgia Institute of Technology via Coursera Einführung in Computer Vision
Technische Universität München (Technical University of Munich) via Coursera Introduction to Computer Vision
Georgia Institute of Technology via Udacity