Coding a ChatGPT-Like Transformer From Scratch in PyTorch
Offered By: StatQuest with Josh Starmer via YouTube
Course Description
Overview
Walk through the process of coding a ChatGPT-like Transformer from scratch using PyTorch in this comprehensive 31-minute video tutorial. Learn how to load the necessary modules, create a training dataset, implement position encoding, code the attention mechanism, and build a decoder-only Transformer. Observe the model running untrained before diving into the training process and practical application. The tutorial explains every implementation step in detail, and assumes prior knowledge of decoder-only Transformers, the essential matrix algebra used in neural networks, and the matrix math behind Transformers.
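To give a flavor of the kind of code the tutorial builds, here is a minimal sketch of sinusoidal position encoding in PyTorch. The function name and dimensions are illustrative assumptions, not the video's exact code:

```python
import math
import torch

def position_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Sinusoidal position encoding: each position gets a unique
    pattern of sine and cosine values at different frequencies."""
    pos = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)      # (seq_len, 1)
    div = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                    * (-math.log(10000.0) / d_model))                # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)   # even embedding columns: sine
    pe[:, 1::2] = torch.cos(pos * div)   # odd embedding columns: cosine
    return pe

pe = position_encoding(seq_len=6, d_model=4)
print(pe.shape)  # torch.Size([6, 4])
```

These values are added to the token embeddings so the model can distinguish word order, since attention itself is order-agnostic.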
Syllabus
Awesome song and introduction
Loading the modules
Creating the training dataset
Coding Position Encoding
Coding Attention
Coding a Decoder-Only Transformer
Running the model untrained
Training and using the model
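The attention step in the syllabus is the heart of a decoder-only Transformer. Here is a minimal sketch of masked (causal) scaled dot-product self-attention in PyTorch; the function name and tensor shapes are illustrative assumptions, not taken from the video:

```python
import math
import torch
import torch.nn.functional as F

def masked_self_attention(q: torch.Tensor,
                          k: torch.Tensor,
                          v: torch.Tensor) -> torch.Tensor:
    """Scaled dot-product attention with a causal mask, so each token
    attends only to itself and earlier tokens (decoder-only behavior)."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)   # token-to-token similarities
    seq_len = scores.size(-1)
    # Boolean mask of the strictly upper triangle: True marks future positions.
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))    # hide future tokens
    weights = F.softmax(scores, dim=-1)                 # rows sum to 1
    return weights @ v

x = torch.randn(5, 8)                  # 5 tokens, 8-dimensional embeddings
out = masked_self_attention(x, x, x)   # self-attention: q = k = v
print(out.shape)  # torch.Size([5, 8])
```

Because of the causal mask, the first token can only attend to itself, so its output equals its own value vector; this is what lets the model generate text one token at a time.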
Taught by
StatQuest with Josh Starmer
Related Courses
Introduction to Artificial Intelligence - Stanford University via Udacity
Natural Language Processing - Columbia University via Coursera
Probabilistic Graphical Models 1: Representation - Stanford University via Coursera
Computer Vision: The Fundamentals - University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course) - California Institute of Technology via Independent