YoVDO

Introduction to Multimodal Prompting for Generative AI

Offered By: LinkedIn Learning

Tags

Computer Vision Courses, GPT-4 Courses, Generative AI Courses, Speech Recognition Courses, Gemini Courses

Course Description

Overview

Learn how you can leverage modern AI systems that utilize multimodality.

Syllabus

Introduction
  • GenAI with multimodal prompts
1. Multimodality
  • What is multimodality?
  • Visual modality
  • Textual and auditory modality
2. GPT-4
  • GPT-4 and 4o
  • Text to image in GPT-4
  • GPT-4 API with various input types
  • Challenge: Drawing to code
  • Solution: Drawing to code
3. Gemini
  • What is Gemini?
  • Images in Gemini
  • Gemini video inputs
  • Challenge: Video narration
  • Solution: Video narration
4. Auditory Modalities
  • Audio in generative AI
  • Prompt and audio
  • Generating music
  • Challenge: Soundtrack creation
  • Solution: Soundtrack creation
Conclusion
  • Next steps

Taught by

Ronnie Sheer

Related Courses

  • Learn Google Bard and Gemini (Udemy)
  • Gemini and the Future of Generative AI Tools - Interview with Simon Tokumine (TensorFlow via YouTube)
  • Gemini and GPT Sales Agents with RAG - Comparison and Implementation (echohive via YouTube)
  • Building a Streamlit Interface for Unified Chat with Multiple LLMs (echohive via YouTube)
  • Gemini 1.5 Pro for Code - Building LLM Agents with CrewAI (Sam Witteveen via YouTube)