Feedback Transformers - Addressing Some Limitations of Transformers with Feedback Memory
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore the concept of Feedback Transformers in this 44-minute video lecture. Delve into the limitations of autoregressive Transformers in language modeling and discover how Feedback Transformers address these issues. Learn about information flow in recurrent neural networks and Transformers, complex computations with neural networks, and causal masking. Examine the Feedback Transformer architecture, its connection to Attention-RNNs, and its formal definition. Review experimental results demonstrating the improved performance of this approach in language modeling, machine translation, and reinforcement learning tasks. Gain insights into how Feedback Transformers enhance representation capacity, allowing for smaller, shallower models with stronger performance compared to traditional Transformers.
Syllabus
- Intro & Overview
- Problems of Autoregressive Processing
- Information Flow in Recurrent Neural Networks
- Information Flow in Transformers
- Solving Complex Computations with Neural Networks
- Causal Masking in Transformers
- Missing Higher Layer Information Flow
- Feedback Transformer Architecture
- Connection to Attention-RNNs
- Formal Definition
- Experimental Results
- Conclusion & Comments
Taught by
Yannic Kilcher
Related Courses
- الشبكات العصبية والتعلم العميق (Neural Networks and Deep Learning), offered by DeepLearning.AI via Coursera
- Machine Learning: Create a Neural Network that Predicts whether an Image is a Car or Airplane, offered by Coursera Project Network via Coursera
- Neural Network Programming - Deep Learning with PyTorch, via YouTube
- Computer Vision with GluonCV (Traditional Chinese), offered by Amazon Web Services via AWS Skill Builder
- Neuronales Netz von Scratch (Neural Network from Scratch), offered by Coursera Project Network via Coursera