Deep Dive into the Transformer Encoder Architecture

Offered By: CodeEmporium via YouTube

Tags

Transformer Architecture Courses
Deep Learning Courses
Neural Networks Courses
Embeddings Courses
Self-Attention Courses
Positional Encoding Courses

Course Description

Overview

Dive deep into the transformer encoder architecture in this 21-minute video tutorial. Explore the intricacies of initial embeddings, positional encodings, and the encoder layer structure. Learn about query, key, and value vectors, self-attention matrix construction, and the importance of scaling and softmax. Understand the combination of attention heads, residual connections, layer normalization, and the role of linear layers, ReLU, and dropout. Conclude with insights on final word embeddings and a sneak peek at the code implementation.
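
As a rough illustration of the self-attention step described above, here is a minimal, dependency-free sketch of single-head scaled dot-product attention. It is not the code shown in the video. The helper names (softmax, matmul, transpose, self_attention) and the toy input are assumptions made for this example, and the queries, keys, and values are taken to be the input itself rather than learned projections.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def self_attention(q, k, v):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))  # seq_len x seq_len raw attention scores
    scaled = [[s / math.sqrt(d_k) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]  # each row sums to 1
    return matmul(weights, v), weights

# Toy input: 3 tokens with 4-dimensional vectors (values are arbitrary).
x = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 2.0, 0.0, 2.0],
     [1.0, 1.0, 1.0, 1.0]]

# Assumption: Q, K, and V are x itself; a real encoder applies learned projections first.
output, weights = self_attention(x, x, x)
```

Dividing by sqrt(d_k) keeps the scores in a range where softmax does not saturate, and softmax turns each row of scores into weights that sum to 1. A multi-head version would run several such computations on different projections and concatenate the results.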

Syllabus

Introduction
Encoder Overview
Blowing up the encoder
Create Initial Embeddings
Positional Encodings
The Encoder Layer Begins
Query, Key, Value Vectors
Constructing Self Attention Matrix
Why scaling and Softmax?
Combining Attention heads
Residual Connections (Skip Connections)
Layer Normalization
Why Linear Layers, ReLU, Dropout
Complete the Encoder Layer
Final Word Embeddings
Sneak Peek of Code
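
The later syllabus items (residual connections, layer normalization, linear layers with ReLU) can be sketched for a single token vector as follows. This is a hedged illustration, not the implementation previewed in the video. All function names and the toy weights are hypothetical. It assumes normalization is applied after the residual addition, leaves the learnable layer-norm gain and bias at 1 and 0, and omits dropout, as at inference time.

```python
import math

def layer_norm(vec, eps=1e-5):
    """Normalize one token vector to zero mean and (near) unit variance.
    Assumption: the learnable gain and bias are left at 1 and 0."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]

def linear(vec, weight, bias):
    """Compute y = W x + b, with weight given as a list of output rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]

def relu(vec):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in vec]

def feed_forward(vec, w1, b1, w2, b2):
    """Linear -> ReLU -> Linear; dropout is omitted (inference-time behavior)."""
    return linear(relu(linear(vec, w1, b1)), w2, b2)

def add_and_norm(x, sublayer_out):
    """Residual (skip) connection followed by layer normalization."""
    return layer_norm([a + b for a, b in zip(x, sublayer_out)])

# Hypothetical toy weights: model dimension 2, hidden size 3.
w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]]
b2 = [0.0, 0.0]

token = [2.0, 1.0]
ffn_out = feed_forward(token, w1, b1, w2, b2)
result = add_and_norm(token, ffn_out)
```

The skip connection adds the sublayer's input back to its output, so the sublayer only has to learn a correction to the existing representation. Layer normalization then rescales each token vector independently of the others.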


Taught by

CodeEmporium

Related Courses

NeRF - Representing Scenes as Neural Radiance Fields for View Synthesis
Yannic Kilcher via YouTube
Perceiver - General Perception with Iterative Attention
Yannic Kilcher via YouTube
LambdaNetworks - Modeling Long-Range Interactions Without Attention
Yannic Kilcher via YouTube
Attention Is All You Need - Transformer Paper Explained
Aleksa Gordić - The AI Epiphany via YouTube
NeRFs - Neural Radiance Fields - Paper Explained
Aladdin Persson via YouTube