Grokking - Generalization Beyond Overfitting on Small Algorithmic Datasets
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore the phenomenon of "grokking" in neural networks through this in-depth video explanation of a research paper. See how networks trained on small algorithmic datasets can suddenly generalize, with validation accuracy jumping from chance level to perfect long after the training data has been overfit. Examine how the structure of the underlying binary operations emerges in the learned latent spaces, and investigate the factors that influence grokking. Gain insights into the role of smoothness, why simple explanations win, and how weight decay encourages simplicity in what a network learns. The video follows an outline that covers double descent, binary operations datasets, and the structures that emerge during training.
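To make "binary operations dataset" concrete, here is a minimal sketch of how such a dataset could be built. It is an illustration under stated assumptions, not the paper's or the video's code: the choice of modular addition, the modulus `p=97`, the 50% training split, and the helper name `make_binary_op_dataset` are all picked here for the example.

```python
import random


def make_binary_op_dataset(p=97, train_fraction=0.5, seed=0):
    """Enumerate every equation (a, b) -> (a + b) mod p and split it randomly.

    Hypothetical helper: the operation, modulus, and split ratio are
    assumptions chosen for illustration.
    """
    # The full "multiplication table" of the operation: p * p equations.
    examples = [((a, b), (a + b) % p) for a in range(p) for b in range(p)]

    # Shuffle with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(examples)

    # The first part is shown during training; the rest is held out.
    cut = int(train_fraction * len(examples))
    return examples[:cut], examples[cut:]


train, val = make_binary_op_dataset()
print(len(train), len(val))
```

A network would be trained only on `train` and evaluated on `val`. Because the held-out cells of the table can only be filled in by learning the operation itself, accuracy on `val` is a direct measure of generalization.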
Syllabus
- Intro & Overview
- The Grokking Phenomenon
- Related: Double Descent
- Binary Operations Datasets
- What quantities influence grokking?
- Learned Emerging Structure
- The role of smoothness
- Simple explanations win
- Why does weight decay encourage simplicity?
- Appendix
- Conclusion & Comments
Taught by
Yannic Kilcher
Related Courses
- Generative Adversarial Networks - Computerphile (Computerphile via YouTube)
- Interpreting Deep Generative Models for Interactive AI Content Creation (Bolei Zhou via YouTube)
- Exploring and Exploiting Interpretable Semantics in GANs - CVPR 2020 Tutorial (Bolei Zhou via YouTube)
- Autoencoders Explained Easily (Valerio Velardo - The Sound of AI via YouTube)
- All Things VQGAN - AutoEncoder with Latent Space and Reconstruction Error - Part 1 (Prodramp via YouTube)