YoVDO

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

Offered By: Launchpad via YouTube

Tags

Robotics Courses Machine Learning Courses Computer Vision Courses Reinforcement Learning Courses Data Augmentation Courses Vision-Language Models Courses

Course Description

Overview

Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Discover the groundbreaking LLaRA framework in this 16-minute video presentation by the Fellowship.ai team. Delve into the innovative approach of enhancing robotic action policy through Large Language Models (LLMs) and Vision-Language Models (VLMs). Learn how LLaRA formulates robot actions as conversation-style instruction-response pairs and improves decision-making by incorporating auxiliary data. Explore the process of training VLMs with visual-textual prompts and the automated pipeline for generating high-quality robotics instruction data from existing behavior cloning datasets. Gain insights into how this framework enables optimal policy decisions for robotic tasks, showcasing state-of-the-art performance in both simulated and real-world environments. Access the code, datasets, and pretrained models on GitHub to further your understanding of this cutting-edge AI innovation in robot learning.

Syllabus

Fellowship: LLaRA, Supercharging Robot Learning Data for Vision-Language Policy


Taught by

Launchpad

Related Courses

TensorFlow を使った畳み込みニューラルネットワーク
DeepLearning.AI via Coursera
Emotion AI: Facial Key-points Detection
Coursera Project Network via Coursera
Transfer Learning for Food Classification
Coursera Project Network via Coursera
Facial Expression Classification Using Residual Neural Nets
Coursera Project Network via Coursera
Apply Generative Adversarial Networks (GANs)
DeepLearning.AI via Coursera