Transformer Neural Networks, ChatGPT's Foundation, Clearly Explained
Offered By: StatQuest with Josh Starmer via YouTube
Course Description
Overview
Watch a 36-minute video explanation of Transformer Neural Networks, the foundation of AI technologies like ChatGPT and Google Translate. Learn about word embedding, positional encoding, self-attention, and the encoder-decoder architecture. Explore how Transformers are designed for parallel computing, and follow the decoding process step by step. The video also covers additional components that can be added to a Transformer to improve its performance. Supplementary links are provided for a deeper understanding of related concepts such as backpropagation, the SoftMax function, and cosine similarity.
Syllabus
Awesome song and introduction
Word Embedding
Positional Encoding
Self-Attention
Encoder and Decoder defined
Decoder Word Embedding
Decoder Positional Encoding
Transformers were designed for parallel computing
Decoder Self-Attention
Encoder-Decoder Attention
Decoding numbers into words
Decoding the second token
Extra stuff you can add to a Transformer
Taught by
StatQuest with Josh Starmer
Related Courses
NeRF - Representing Scenes as Neural Radiance Fields for View Synthesis
Yannic Kilcher via YouTube
Perceiver - General Perception with Iterative Attention
Yannic Kilcher via YouTube
LambdaNetworks- Modeling Long-Range Interactions Without Attention
Yannic Kilcher via YouTube
Attention Is All You Need - Transformer Paper Explained
Aleksa Gordić - The AI Epiphany via YouTube
NeRFs- Neural Radiance Fields - Paper Explained
Aladdin Persson via YouTube