LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Offered By: Launchpad via YouTube
Course Description
Overview
Discover the LLaRA framework in this 16-minute video presentation by the Fellowship.ai team. Delve into its approach of enhancing robot action policies through Large Language Models (LLMs) and Vision-Language Models (VLMs). Learn how LLaRA formulates robot actions as conversation-style instruction-response pairs and improves decision making by incorporating auxiliary instruction data. Explore the process of training VLMs with visual-textual prompts and the automated pipeline that generates high-quality robotics instruction data from existing behavior cloning datasets. Gain insights into how the framework produces effective policy decisions for robotic tasks, achieving strong performance in both simulated and real-world environments. Access the code, datasets, and pretrained models on GitHub to deepen your understanding of this approach to robot learning.
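The central data transformation described above, recasting steps from a behavior cloning dataset as conversation-style instruction-response pairs for VLM instruction tuning, can be illustrated with a short sketch. The snippet below is a minimal illustration under assumed, hypothetical field names (image_path, task, action_xy, rotation_deg) and a simplified 2D action format; it is not the exact schema or pipeline used by LLaRA.

```python
# Sketch: format one behavior-cloning step as an instruction-response pair
# for visual instruction tuning. Field names and the action format are
# illustrative assumptions, not LLaRA's actual data schema.
from dataclasses import dataclass
from typing import Dict


@dataclass
class BCStep:
    image_path: str        # observation image for this step
    task: str              # natural-language task description
    action_xy: tuple       # assumed normalized 2D target position
    rotation_deg: float    # assumed end-effector rotation


def to_instruction_pair(step: BCStep) -> Dict[str, str]:
    """Turn a single behavior-cloning step into a conversation-style example."""
    instruction = (
        f"<image>\nThe task is: {step.task}. "
        "What action should the robot take next?"
    )
    response = (
        f"Move to ({step.action_xy[0]:.3f}, {step.action_xy[1]:.3f}) "
        f"with rotation {step.rotation_deg:.1f} degrees."
    )
    return {
        "image": step.image_path,
        "instruction": instruction,
        "response": response,
    }


if __name__ == "__main__":
    step = BCStep("obs_0001.png", "pick up the red block", (0.42, 0.77), 30.0)
    print(to_instruction_pair(step))
```

Applying such a conversion over every step of an existing behavior cloning dataset yields the kind of instruction-tuning corpus the video describes, which is then used to fine-tune a VLM as the action policy.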
Syllabus
Fellowship: LLaRA, Supercharging Robot Learning Data for Vision-Language Policy
Taught by
Launchpad
Related Courses
Convolutional Neural Networks in TensorFlow (DeepLearning.AI via Coursera)
Emotion AI: Facial Key-points Detection (Coursera Project Network via Coursera)
Transfer Learning for Food Classification (Coursera Project Network via Coursera)
Facial Expression Classification Using Residual Neural Nets (Coursera Project Network via Coursera)
Apply Generative Adversarial Networks (GANs) (DeepLearning.AI via Coursera)