Learning Rate Grafting: Transferability of Optimizer Tuning - Machine Learning Research Paper Review

Offered By: Yannic Kilcher via YouTube

Tags

Machine Learning Courses, Deep Learning Courses, Optimization Algorithms Courses

Course Description

Overview

Explore a comprehensive review of a machine learning research paper on Learning Rate Grafting, focusing on the transferability of optimizer tuning. Delve into optimization algorithms such as SGD, AdaGrad, Adam, LARS, and LAMB, examining their implicit learning rate schedules and update directions. Discover the grafting technique, which transfers the learning rate schedule of one optimizer onto the update direction of another, and understand its implications for deep learning research. Learn about the experimental results, the static transfer of learning rate ratios, and the potential for significant GPU memory savings. Gain insight into how optimizers and learning rate schedules are entangled, and how grafting can provide a robust baseline for optimizer comparisons and reduce the computational cost of hyperparameter searches.
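Based on the description above, a minimal sketch of a single grafted update might look like the following: one optimizer contributes the step's magnitude and another its direction. This is an illustrative NumPy sketch only, not the paper's reference implementation; the names `graft_step`, `step_sgd`, and `step_adam` are assumptions, and in practice the rescaling is typically applied per layer (per parameter group) rather than globally.

```python
import numpy as np

def graft_step(step_m: np.ndarray, step_d: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Grafted update (sketch): keep the direction of `step_d`,
    rescaled to the norm (implicit step size) of `step_m`."""
    norm_m = np.linalg.norm(step_m)
    norm_d = np.linalg.norm(step_d)
    if norm_d < eps:
        # Guard choice for a (near-)zero direction step; an assumption, not from the paper.
        return np.zeros_like(step_d)
    return (norm_m / norm_d) * step_d

# Toy usage: pretend `step_sgd` is SGD's update (magnitude donor) and
# `step_adam` is Adam's update (direction donor).
step_sgd = np.array([0.10, -0.20, 0.05])
step_adam = np.array([0.30, 0.10, -0.40])
grafted = graft_step(step_sgd, step_adam)
print(grafted)
print(np.linalg.norm(grafted), np.linalg.norm(step_sgd))  # norms match
```

The "static transfer" mentioned above would then amount to recording these per-layer norm ratios once and reusing them as fixed scale factors, so the magnitude optimizer's state no longer needs to be kept in GPU memory.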

Syllabus

- Rant about Reviewer #2
- Intro & Overview
- Adaptive Optimization Methods
- Grafting Algorithm
- Experimental Results
- Static Transfer of Learning Rate Ratios
- Conclusion & Discussion


Taught by

Yannic Kilcher

Related Courses

Deep Learning for Natural Language Processing
University of Oxford via Independent
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
DeepLearning.AI via Coursera
Deep Learning Part 1 (IITM)
Indian Institute of Technology Madras via Swayam
Deep Learning - Part 1
Indian Institute of Technology, Ropar via Swayam
Logistic Regression with Python and Numpy
Coursera Project Network via Coursera