Decoder-Only Transformers, ChatGPT's Specific Transformer, Clearly Explained
Offered By: StatQuest with Josh Starmer via YouTube
Course Description
Overview
Syllabus
Transformers are taking over AI right now, and quite possibly their most famous use is in ChatGPT. ChatGPT uses a specific type of Transformer called a Decoder-Only Transformer, and this StatQuest shows you how they work, one step at a time. And at the end at , we talk about the differences between a Normal Transformer and a Decoder-Only Transformer. BAM!
Awesome song and introduction
Word Embedding
Position Encoding
Masked Self-Attention, an Autoregressive method
Residual Connections
Generating the next word in the prompt
Review of encoding and generating the prompt
Generating the output, Part 1
Masked Self-Attention while generating the output
Generating the output, Part 2
Normal Transformers vs Decoder-Only Transformers
Taught by
StatQuest with Josh Starmer
Related Courses
Introduction to Artificial IntelligenceStanford University via Udacity Probabilistic Graphical Models 1: Representation
Stanford University via Coursera Artificial Intelligence for Robotics
Stanford University via Udacity Computer Vision: The Fundamentals
University of California, Berkeley via Coursera Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent