From Large Language Models to Large Multimodal Models - Stanford CS25 - Lecture 4

Offered By: Stanford University via YouTube

Tags

Artificial Intelligence Courses, Machine Learning Courses, Computer Vision Courses, Generative Models Courses

Course Description

Overview

Explore the evolution from large language models to large multimodal models in this Stanford University lecture. Review the basics of large language models and examine the academic community's efforts to develop multimodal models over the past year. Learn about CogVLM, a powerful open-source multimodal model with 17B parameters, and CogAgent, a model designed for GUI and OCR scenarios. Discover applications of multimodal models and potential research directions in academia. Speaker Ming Ding, a research scientist at Zhipu AI, shares insights on multimodal generative models, understanding models, and language models. Learn how visual perception is integrated with language model capabilities in this 1-hour, 20-minute presentation from the Stanford CS25 Transformers United series.

Syllabus

Stanford CS25: V4 I From Large Language Models to Large Multimodal Models


Taught by

Stanford Online

Related Courses

Visual Recognition & Understanding
University at Buffalo via Coursera
Deep Learning for Computer Vision
IIT Hyderabad via Swayam
Deep Learning in Life Sciences - Spring 2021
Massachusetts Institute of Technology via YouTube
Advanced Deep Learning Methods for Healthcare
University of Illinois at Urbana-Champaign via Coursera
Generative Models
Serrano.Academy via YouTube