Recurrent Neural Networks, Transformers, and Attention - MIT 6.S191 Lecture 2
Offered By: Alexander Amini via YouTube
Course Description
Overview
Dive into the world of advanced deep learning techniques in this comprehensive lecture from MIT's Introduction to Deep Learning course. Explore the intricacies of Recurrent Neural Networks (RNNs), including their architecture, intuition, and implementation from scratch. Discover the challenges of sequence modeling and learn about backpropagation through time. Investigate gradient issues and understand how Long Short-Term Memory (LSTM) networks address these problems. Delve into the fundamentals of attention mechanisms, their relationship to search, and how they're implemented in neural networks. Examine real-world applications of RNNs and attention-based models, and gain insights into scaling attention for various tasks. By the end of this lecture, you'll have a solid understanding of these advanced deep learning concepts and their practical applications in the field.
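To make the "RNNs from scratch" portion concrete, here is a minimal NumPy sketch of a vanilla RNN cell and its unfolding over a sequence. This is not the lecture's actual code; the dimensions, initialization, and toy input are illustrative assumptions.

```python
# A minimal sketch of a vanilla RNN cell, assuming NumPy and
# hypothetical dimensions; not the lecture's exact implementation.
import numpy as np

rng = np.random.default_rng(0)

input_dim, hidden_dim = 8, 16          # illustrative sizes (assumptions)
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    # Core recurrence: the new hidden state mixes the current input
    # with the previous hidden state through a tanh nonlinearity.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Unfolding: the same weights are reused at every timestep, which is
# also why gradients flow through repeated matrix products during
# backpropagation through time.
h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):   # toy 5-step sequence
    h = rnn_step(x_t, h)
print(h.shape)  # (16,)
```

The repeated multiplication by W_hh in the unfolded loop is the source of the vanishing and exploding gradient issues the lecture discusses, and the motivation for LSTM gating.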
Syllabus
- Introduction
- Sequence modeling
- Neurons with recurrence
- Recurrent neural networks
- RNN intuition
- Unfolding RNNs
- RNNs from scratch
- Design criteria for sequential modeling
- Word prediction example
- Backpropagation through time
- Gradient issues
- Long short-term memory (LSTM)
- RNN applications
- Attention fundamentals
- Intuition of attention
- Attention and search relationship
- Learning attention with neural networks (see the attention sketch after this list)
- Scaling attention and applications
- Summary
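For the attention topics above, the core computation is scaled dot-product attention: each query is scored against all keys, and the softmaxed scores weight a combination of the values. The NumPy sketch below is a hedged illustration under assumed toy shapes, not the lecture's implementation.

```python
# A minimal sketch of scaled dot-product attention in NumPy;
# sequence length and embedding size are illustrative assumptions.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scores: how well each query matches each key, scaled by sqrt(d).
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    # Output: each position is a weighted combination of the values.
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 4, 8                      # toy sizes (assumptions)
Q = rng.normal(size=(seq_len, d))
K = rng.normal(size=(seq_len, d))
V = rng.normal(size=(seq_len, d))
out, w = attention(Q, K, V)
print(out.shape, w.sum(axis=-1))       # (4, 8); each weight row sums to 1
```

This search-like pattern (query against keys, retrieve values) is the "attention and search relationship" the syllabus refers to, and it scales to full transformer models by learning the Q, K, V projections with neural network layers.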
Taught by
Alexander Amini - https://www.youtube.com/@AAmini/videos
Related Courses
- Applied Deep Learning: Build a Chatbot - Theory, Application (Udemy)
- Can Wikipedia Help Offline Reinforcement Learning? - Paper Explained (Yannic Kilcher via YouTube)
- Infinite Memory Transformer - Research Paper Explained (Yannic Kilcher via YouTube)
- Recurrent Neural Networks and Transformers (Alexander Amini via YouTube)
- MIT 6.S191 - Recurrent Neural Networks (Alexander Amini via YouTube)