RWKV: Reinventing RNNs for the Transformer Era

Offered By: Yannic Kilcher via YouTube

Tags

Recurrent Neural Networks (RNN) Courses, Long Short-Term Memory (LSTM) Courses, Computational Complexity Courses, Transformers Courses, Neural Network Architecture Courses

Course Description

Overview

Explore an in-depth analysis of the Receptance Weighted Key Value (RWKV) model, a groundbreaking architecture that bridges the gap between Transformers and Recurrent Neural Networks (RNNs). Delve into the evolution of linear attention mechanisms, RWKV's layer structure, and how it combines the parallelizable training of a Transformer with the constant-memory, constant-time-per-token inference of an RNN. Examine experimental results, limitations, and visualizations showing RWKV performing on par with similarly sized Transformers. Gain insight into this innovative approach, which reconciles computational efficiency with model performance in sequence-processing tasks and may shape the future of natural language processing.
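To make the "best of both worlds" claim concrete, here is a minimal NumPy sketch of the WKV operator at the heart of RWKV's linear attention, run in its sequential (RNN) mode. Function and variable names are illustrative, and the log-space numerical-stability trick used by real RWKV kernels is omitted for clarity.

```python
import numpy as np

def wkv_recurrent(k, v, w, u):
    """Sequential (RNN-mode) sketch of RWKV's WKV operator.

    k, v : (T, D) key and value sequences (one channel per dim)
    w    : (D,) per-channel decay rate (state decays by e^{-w} per step)
    u    : (D,) bonus applied to the current token

    Illustrative names; omits the numerical-stability trick that
    production implementations use to keep the exponentials bounded.
    """
    T, D = k.shape
    num = np.zeros(D)            # running weighted sum of past values
    den = np.zeros(D)            # running sum of past weights
    out = np.zeros((T, D))
    for t in range(T):
        # the current token contributes with an extra bonus u
        cur = np.exp(u + k[t])
        out[t] = (num + cur * v[t]) / (den + cur)
        # fold the current token into the state, decaying the past
        num = np.exp(-w) * num + np.exp(k[t]) * v[t]
        den = np.exp(-w) * den + np.exp(k[t])
    return out
```

Because the entire history is carried in the fixed-size pair (num, den), each generation step costs O(1) in the sequence length, which is the RNN-style inference the video contrasts with a Transformer's ever-growing key-value cache.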

Syllabus

- Introduction
- Fully Connected In-Person Conference in SF June 7th
- Transformers vs RNNs
- RWKV: Best of both worlds
- LSTMs
- Evolution of RWKV's Linear Attention
- RWKV's Layer Structure
- Time-Parallel vs Sequence Mode (see the sketch after this syllabus)
- Experimental Results & Limitations
- Visualizations
- Conclusion
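
As a companion to the "Time-Parallel vs Sequence Mode" chapter, here is a sketch of the same WKV operator computed in time-parallel form: every output position is evaluated independently from its prefix, so training can batch over all positions (at quadratic cost in T), and the result should match the recurrent sketch above to floating-point tolerance. Names are again illustrative and the stability trick is omitted.

```python
import numpy as np

def wkv_parallel(k, v, w, u):
    """Time-parallel sketch of the WKV operator: each position t is
    computed independently from the full prefix, so all positions can
    be processed at once during training (quadratic work in T)."""
    T, D = k.shape
    out = np.zeros((T, D))
    for t in range(T):
        i = np.arange(t)                         # past positions 0..t-1
        decay = -(t - 1 - i)[:, None] * w        # (t, D) decay exponents
        weights = np.exp(decay + k[:t])          # e^{-(t-1-i)w + k_i}
        cur = np.exp(u + k[t])                   # bonus-weighted current token
        out[t] = ((weights * v[:t]).sum(0) + cur * v[t]) / (weights.sum(0) + cur)
    return out

# Sanity check against the sequential sketch above (illustrative):
# T, D = 8, 4
# rng = np.random.default_rng(0)
# k, v = rng.normal(size=(T, D)), rng.normal(size=(T, D))
# w, u = rng.uniform(0.1, 1.0, D), rng.normal(size=D)
# assert np.allclose(wkv_parallel(k, v, w, u), wkv_recurrent(k, v, w, u))
```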


Taught by

Yannic Kilcher

Related Courses

Neural Networks and Deep Learning (Arabic)
DeepLearning.AI via Coursera
Machine Learning: Create a Neural Network that Predicts whether an Image is a Car or Airplane.
Coursera Project Network via Coursera
Neural Network Programming - Deep Learning with PyTorch
YouTube
Computer Vision with GluonCV (Traditional Chinese)
Amazon Web Services via AWS Skill Builder
Neural Network from Scratch (German)
Coursera Project Network via Coursera