YoVDO

Building Generalist Robotics Policies from Scratch

Offered By: Montreal Robotics via YouTube

Tags

Robotics Courses, Machine Learning Courses, Computer Vision Courses, Neural Networks Courses, Transformers Courses, Data Preprocessing Courses, Vision Transformers Courses

Course Description

Overview

Dive into a comprehensive video tutorial on building Generalist Robotics Policies from scratch. Learn how to implement the "Octo: An Open-Source Generalist Robot Policy" model step by step, starting with basic transformer code and progressing to training the model on data from the Open X-Embodiment dataset. Explore topics such as data exploration, dataset creation, transformer encoder implementation, image patch tokenization, and Vision Transformer (ViT) construction. Discover techniques for incorporating text inputs, handling continuous and discrete actions, standardizing state inputs and action spaces, and integrating goal images into the transformer architecture. Gain insights into scaling training, analyzing training results across A100 GPUs, and evaluating the model using the SimpleEnv robotics simulator. Access accompanying code, project details, and additional resources to deepen your understanding of Generalist Robotics Policies and their applications in robotics.

Syllabus

Intro: ChatGPT, Language Models and the Goals of Generalist Robotics Policies
Reading and exploring the data
Creating a Dataset
Creating the transformer encoder
Creating image patches to tokenize
Putting together the ViT
Training the ViT
Making the GRP, starting with adding text inputs
Modifying the data for training
Converting continuous actions to discrete bins
Standardizing the state inputs
Changing to use continuous actions
Standardizing the action space
Adding goal images to the transformer
Adding block-masked attention to use either goal (text or image)
Scaling training
Training results across A100s
Evaluation using the SimpleEnv robotics simulator
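
Two of the syllabus steps, creating image patches to tokenize and converting continuous actions to discrete bins, can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the course's code: the function names, the 16-pixel patch size, the [-1, 1] action range, and the 256 bins are all assumptions chosen for the example.

```python
import numpy as np


def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping square patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row. Hypothetical helper for illustration.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    rows, cols = h // patch_size, w // patch_size
    # (rows, patch, cols, patch, C) -> (rows, cols, patch, patch, C)
    patches = image.reshape(rows, patch_size, cols, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(rows * cols, patch_size * patch_size * c)


def discretize_actions(actions, low, high, n_bins):
    """Map continuous actions in [low, high] to integer bins 0..n_bins-1."""
    clipped = np.clip(actions, low, high)
    scaled = (clipped - low) / (high - low)  # now in [0, 1]
    bins = (scaled * n_bins).astype(np.int64)
    return np.minimum(bins, n_bins - 1)  # the value `high` falls in the last bin


def undiscretize_actions(bins, low, high, n_bins):
    """Map bin indices back to the continuous value at each bin's center."""
    return low + (bins + 0.5) * (high - low) / n_bins


if __name__ == "__main__":
    image = np.zeros((64, 64, 3), dtype=np.float32)  # assumed image size
    tokens = image_to_patches(image, patch_size=16)
    print(tokens.shape)  # one row per patch, ready for a linear embedding

    actions = np.array([-1.0, 0.0, 1.0])  # assumed normalized action range
    bins = discretize_actions(actions, low=-1.0, high=1.0, n_bins=256)
    print(bins, undiscretize_actions(bins, -1.0, 1.0, 256))
```

Each patch row would typically be passed through a learned linear projection to become a transformer token, and the bin indices can serve as classification targets. Recovering a continuous action from a bin index loses at most half a bin width of precision.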


Taught by

Montreal Robotics

Related Courses

Vision Transformers Explained + Fine-Tuning in Python
James Briggs via YouTube
ConvNeXt - A ConvNet for the 2020s - Paper Explained
Aleksa Gordić - The AI Epiphany via YouTube
Do Vision Transformers See Like Convolutional Neural Networks - Paper Explained
Aleksa Gordić - The AI Epiphany via YouTube
Stable Diffusion and Friends - High-Resolution Image Synthesis via Two-Stage Generative Models
HuggingFace via YouTube
Intro to Dense Vectors for NLP and Vision
James Briggs via YouTube