YoVDO

Creating Voiceovers with OpenAI's Text-to-Speech and Vision Models

Offered By: Ian Wootten via YouTube

Tags

OpenAI Courses
Computer Vision Courses
Text to Speech Courses
Audio Generation Courses

Course Description

Overview

This 15-minute video tutorial covers OpenAI's text-to-speech (TTS) and GPT-4V models. It shows how to generate audio from text with TTS, how to produce image descriptions with GPT-4V, and how to combine the two to create voiceovers for images and videos. The presenter works through practical examples and highlights some of the ways developers have been using these tools.

Syllabus

Intro
Using TTS to create audio
Using GPT-4V to describe images
Using TTS and GPT-4V for video voiceovers
Conclusion


Taught by

Ian Wootten

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Computational Photography
Georgia Institute of Technology via Coursera
Einführung in Computer Vision
Technische Universität München (Technical University of Munich) via Coursera
Introduction to Computer Vision
Georgia Institute of Technology via Udacity