Transformer Neural Networks, ChatGPT's Foundation, Clearly Explained

Offered By: StatQuest with Josh Starmer via YouTube

Tags

Self-Attention, Artificial Intelligence, Parallel Computing, Word Embeddings, Positional Encoding

Course Description

Overview

Dive into a comprehensive 36-minute video explanation of Transformer Neural Networks, the foundation of cutting-edge AI technologies like ChatGPT and Google Translate. Learn about word embedding, positional encoding, self-attention mechanisms, and the encoder-decoder architecture. Explore how Transformers are designed for parallel computing and understand the decoding process. Gain insights into additional components that can enhance Transformer performance. Supplementary links are provided for a deeper understanding of related concepts such as backpropagation, the SoftMax function, and cosine similarity.
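
As a concrete illustration of two of the mechanisms named above, here is a minimal NumPy sketch of sinusoidal positional encoding and single-head scaled dot-product self-attention. It follows the standard formulation from the original Transformer paper rather than the simplified numbers used in the video, and the matrix sizes and function names are illustrative assumptions.

import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding: each position gets a unique
    # pattern of sine/cosine values, letting the model track word
    # order even though attention itself is order-agnostic.
    positions = np.arange(seq_len)[:, None]
    dims = np.arange(0, d_model, 2)[None, :]
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions
    pe[:, 1::2] = np.cos(angles)   # odd dimensions
    return pe

def self_attention(X, Wq, Wk, Wv):
    # Scaled dot-product self-attention for a single head.
    # X: (seq_len, d_model) word embeddings plus positional encodings.
    # Wq, Wk, Wv: learned query/key/value projection matrices.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # every token scored against every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise SoftMax
    return weights @ V                                # weighted mix of value vectors

# Toy usage: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8)) + positional_encoding(4, 8)
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)            # (4, 8)

Because every row of the attention matrix is computed independently, the whole operation reduces to a handful of matrix multiplications, which is what makes Transformers so well suited to parallel hardware.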

Syllabus

Awesome song and introduction
Word Embedding
Positional Encoding
Self-Attention
Encoder and Decoder defined
Decoder Word Embedding
Decoder Positional Encoding
Transformers were designed for parallel computing
Decoder Self-Attention
Encoder-Decoder Attention
Decoding numbers into words
Decoding the second token
Extra stuff you can add to a Transformer
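
The decoder chapters listed above (decoder self-attention, decoding the first and second tokens) come down to two ideas: a causal mask that stops each position from attending to later positions, and a generation loop that emits one token at a time. The sketch below is an illustrative assumption in the same NumPy style, with a hypothetical decoder_step callable standing in for the full encoder-decoder stack.

import numpy as np

def masked_self_attention(X, Wq, Wk, Wv):
    # Decoder self-attention: identical to encoder self-attention,
    # except a lower-triangular (causal) mask hides future tokens.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    mask = np.tril(np.ones(scores.shape, dtype=bool))   # position i sees only positions <= i
    scores = np.where(mask, scores, -1e9)               # blocked scores vanish under SoftMax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def greedy_decode(decoder_step, eos_token, max_len=50):
    # decoder_step is a hypothetical callable: given the tokens
    # generated so far, it returns the most likely next token
    # (the argmax of the decoder's SoftMax output).
    tokens = []
    for _ in range(max_len):
        next_token = decoder_step(tokens)
        tokens.append(next_token)
        if next_token == eos_token:   # stop once <EOS> is produced
            break
    return tokens

This is also why training parallelizes but generation does not: during training all target tokens are known, so the masked attention for every position runs at once, while at inference each token must wait for the previous one.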


Taught by

StatQuest with Josh Starmer

Related Courses

Transformers: Text Classification for NLP Using BERT
LinkedIn Learning
TensorFlow: Working with NLP
LinkedIn Learning
TransGAN - Two Transformers Can Make One Strong GAN - Machine Learning Research Paper Explained
Yannic Kilcher via YouTube
Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
Yannic Kilcher via YouTube
Recreate Google Translate - Model Training
Edan Meyer via YouTube