RWKV: Reinventing RNNs for the Transformer Era
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore an in-depth analysis of the Receptance Weighted Key Value (RWKV) model, an architecture that bridges the gap between Transformers and Recurrent Neural Networks (RNNs). Delve into the evolution of linear attention mechanisms, RWKV's layer structure, and how it combines the parallelizable training of Transformers with the efficient, constant-memory inference of RNNs. Examine experimental results, limitations, and visualizations showing RWKV performing on par with similarly sized Transformers. Gain insights into this approach, which reconciles computational efficiency and model performance in sequence processing tasks and may shape the future of natural language processing.
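The central mechanism covered in the video is the paper's WKV operator, a form of linear attention that can be evaluated as a recurrence at inference time. Below is a minimal NumPy sketch of that recurrent form; the function name wkv_recurrent and the omission of numerical stabilization, token shift, the receptance gate, and channel mixing are simplifications for illustration, not the reference implementation.

```python
import numpy as np

def wkv_recurrent(k, v, w, u):
    """Recurrent (RNN-style) evaluation of the WKV operator.

    k, v : (T, C) per-token keys and values
    w    : (C,) per-channel decay rate (positive), applied as e^{-w} each step
    u    : (C,) per-channel bonus weight for the current token

    Returns a (T, C) array of wkv_t outputs. Only two running state vectors
    (a, b) are carried between steps, which is what gives RWKV its
    constant-memory inference mode (no growing attention cache).
    """
    T, C = k.shape
    a = np.zeros(C)            # running decayed sum of e^{k_i} * v_i
    b = np.zeros(C)            # running decayed sum of e^{k_i}
    out = np.zeros((T, C))
    for t in range(T):
        ek = np.exp(k[t])
        bonus = np.exp(u) * ek
        # the current token enters with the bonus weight e^{u + k_t}
        out[t] = (a + bonus * v[t]) / (b + bonus)
        # decay the state by e^{-w}, then fold in the current token
        a = np.exp(-w) * a + ek * v[t]
        b = np.exp(-w) * b + ek
    return out
```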
Syllabus
- Introduction
- Fully Connected In-Person Conference in SF June 7th
- Transformers vs RNNs
- RWKV: Best of both worlds
- LSTMs
- Evolution of RWKV's Linear Attention
- RWKV's Layer Structure
- Time-Parallel vs Sequence Mode (see the sketch after this syllabus)
- Experimental Results & Limitations
- Visualizations
- Conclusion
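The "Time-Parallel vs Sequence Mode" chapter contrasts two ways of evaluating the same WKV operator: a direct summation over past positions, which is parallelizable across time during training, and the constant-state recurrence sketched above, used for inference. The snippet below is only an illustrative sanity check that the two forms agree; it assumes the hypothetical wkv_recurrent sketch from earlier and is not code from the paper.

```python
import numpy as np

def wkv_parallel(k, v, w, u):
    """Direct summation form of the same WKV operator.

    Each position t is computed independently from all earlier positions,
    weighting position i by e^{-(t-1-i)w + k_i} and the current token by
    e^{u + k_t}. This naive version is O(T^2), but every t can be computed
    in parallel, which is the property exploited during training.
    """
    T, C = k.shape
    out = np.zeros((T, C))
    for t in range(T):
        num = np.exp(u + k[t]) * v[t]
        den = np.exp(u + k[t])
        for i in range(t):
            weight = np.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out[t] = num / den
    return out

# Illustrative sanity check that both modes produce the same outputs
# (assumes the wkv_recurrent sketch defined earlier).
rng = np.random.default_rng(0)
T, C = 6, 4
k, v = rng.normal(size=(T, C)), rng.normal(size=(T, C))
w, u = np.abs(rng.normal(size=C)), rng.normal(size=C)
assert np.allclose(wkv_parallel(k, v, w, u), wkv_recurrent(k, v, w, u))
```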
Taught by
Yannic Kilcher
Related Courses
- Reinforcement Learning for Trading Strategies (New York Institute of Finance via Coursera)
- Natural Language Processing with Sequence Models (DeepLearning.AI via Coursera)
- Fake News Detection with Machine Learning (Coursera Project Network via Coursera)
- English/French Translator: Long Short Term Memory Networks (Coursera Project Network via Coursera)
- Text Classification Using Word2Vec and LSTM on Keras (Coursera Project Network via Coursera)