Recurrent Neural Networks, Transformers, and Attention - MIT 6.S191 Lecture 2
Offered By: Alexander Amini via YouTube
Course Description
Overview
Dive into the world of advanced deep learning techniques in this comprehensive lecture from MIT's Introduction to Deep Learning course. Explore the intricacies of Recurrent Neural Networks (RNNs), including their architecture, intuition, and implementation from scratch. Discover the challenges of sequence modeling and learn about backpropagation through time. Investigate gradient issues and understand how Long Short-Term Memory (LSTM) networks address these problems. Delve into the fundamentals of attention mechanisms, their relationship to search, and how they're implemented in neural networks. Examine real-world applications of RNNs and attention-based models, and gain insights into scaling attention for various tasks. By the end of this lecture, you'll have a solid understanding of these advanced deep learning concepts and their practical applications in the field.
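To make the "RNNs from scratch" portion concrete, here is a minimal NumPy sketch of a vanilla RNN cell and its unfolding over a sequence. This is not the lecture's actual code; the dimensions, initialization, and toy input are illustrative assumptions.

```python
# A minimal sketch of a vanilla RNN cell, assuming NumPy and
# hypothetical dimensions; not the lecture's exact implementation.
import numpy as np

rng = np.random.default_rng(0)

input_dim, hidden_dim = 8, 16          # illustrative sizes (assumptions)
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    # Core recurrence: the new hidden state mixes the current input
    # with the previous hidden state through a tanh nonlinearity.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Unfolding: the same weights are reused at every timestep, which is
# also why gradients flow through repeated matrix products during
# backpropagation through time.
h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):   # toy 5-step sequence
    h = rnn_step(x_t, h)
print(h.shape)  # (16,)
```

The repeated multiplication by W_hh in the unfolded loop is the source of the vanishing and exploding gradient issues the lecture discusses, and the motivation for LSTM gating.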
Syllabus
- Introduction
- Sequence modeling
- Neurons with recurrence
- Recurrent neural networks
- RNN intuition
- Unfolding RNNs
- RNNs from scratch
- Design criteria for sequential modeling
- Word prediction example
- Backpropagation through time
- Gradient issues
- Long short-term memory (LSTM)
- RNN applications
- Attention fundamentals
- Intuition of attention
- Attention and search relationship
- Learning attention with neural networks (see the attention sketch after this list)
- Scaling attention and applications
- Summary
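For the attention topics above, the core computation is scaled dot-product attention: each query is scored against all keys, and the softmaxed scores weight a combination of the values. The NumPy sketch below is a hedged illustration under assumed toy shapes, not the lecture's implementation.

```python
# A minimal sketch of scaled dot-product attention in NumPy;
# sequence length and embedding size are illustrative assumptions.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scores: how well each query matches each key, scaled by sqrt(d).
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    # Output: each position is a weighted combination of the values.
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 4, 8                      # toy sizes (assumptions)
Q = rng.normal(size=(seq_len, d))
K = rng.normal(size=(seq_len, d))
V = rng.normal(size=(seq_len, d))
out, w = attention(Q, K, V)
print(out.shape, w.sum(axis=-1))       # (4, 8); each weight row sums to 1
```

This search-like pattern (query against keys, retrieve values) is the "attention and search relationship" the syllabus refers to, and it scales to full transformer models by learning the Q, K, V projections with neural network layers.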
Taught by
Alexander Amini - https://www.youtube.com/@AAmini/videos
Related Courses
- Applied Deep Learning: Build a Chatbot - Theory, Application (Udemy)
- Can Wikipedia Help Offline Reinforcement Learning? - Paper Explained (Yannic Kilcher via YouTube)
- Infinite Memory Transformer - Research Paper Explained (Yannic Kilcher via YouTube)
- Recurrent Neural Networks and Transformers (Alexander Amini via YouTube)
- MIT 6.S191 - Recurrent Neural Networks (Alexander Amini via YouTube)