
One of Three Theoretical Puzzles - Generalization in Deep Networks

Offered By: MITCBMM via YouTube

Tags

Deep Learning Courses
Neural Networks Courses
Gradient Descent Courses

Course Description

Overview

Explore the theoretical puzzle of generalization in deep networks through this comprehensive lecture by Tomaso Poggio of MIT. Delve into key concepts such as how deep networks can avoid the curse of dimensionality for compositional functions, minimizing classification error via a surrogate loss function, and the motivation behind generalization bounds for regression. Examine gradient descent as a gradient dynamical system for unconstrained optimization, illustrated with examples such as Lagrange multipliers. Discover how explicit norm constraints lead to weight normalization and why overparameterized networks can fit the data while still generalizing well. Gain insight into gradient descent for deep ReLU networks, enhancing your understanding of the theoretical foundations underlying deep learning.
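To make the surrogate-loss and overparameterization themes concrete, here is a minimal NumPy sketch, not taken from the lecture: plain gradient descent on a logistic surrogate loss for a small one-hidden-layer ReLU network. The toy data, layer width, learning rate, and step count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs in 2-D with labels in {-1, +1}.
X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)), rng.normal(1.0, 0.5, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])

# Overparameterized hidden layer: far more units than data points require.
h = 200
W1 = rng.normal(0.0, 0.5, (2, h))
w2 = rng.normal(0.0, 0.5, h)

lr = 0.1
for step in range(2000):
    z = X @ W1                       # pre-activations
    a = np.maximum(z, 0.0)           # ReLU
    f = a @ w2                       # real-valued network output
    margins = y * f
    # Logistic surrogate for the 0-1 classification loss.
    loss = np.mean(np.logaddexp(0.0, -margins))
    # Backpropagate the surrogate loss by hand.
    g_f = -y / (1.0 + np.exp(margins)) / len(y)
    g_w2 = a.T @ g_f
    g_a = np.outer(g_f, w2)
    g_z = g_a * (z > 0)              # ReLU derivative
    g_W1 = X.T @ g_z
    W1 -= lr * g_W1
    w2 -= lr * g_w2

# Recompute predictions with the final weights and report the 0-1 error.
pred = np.sign(np.maximum(X @ W1, 0.0) @ w2)
print(f"final surrogate loss {loss:.4f}, training error {np.mean(pred != y):.2%}")
```

The sketch only shows that minimizing the smooth surrogate drives the training classification error down; the lecture's point about why such overparameterized solutions also generalize is a theoretical claim that a toy run like this cannot demonstrate.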

Syllabus

Intro
Deep Networks can avoid the curse of dimensionality for compositional functions
Minimize classification error by minimizing a surrogate function
Motivation: generalization bounds for regression
GD: unconstrained optimization as a gradient dynamical system
Example: Lagrange multiplier
Explicit norm constraint gives weight normalization
Overparametrized networks fit the data and generalize
Gradient Descent for deep ReLU networks
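As a rough illustration of the "Explicit norm constraint gives weight normalization" item above, the following minimal sketch, not the lecture's code, applies the weight-normalization reparameterization w = g · v/||v|| to a single linear unit with a squared-error loss. The direction v and the scale g are trained as separate parameters, so the norm of the effective weight vector is an explicit quantity (it equals |g|); the input, target, and learning rate are made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=5)        # a single toy input
target = 1.0

v = rng.normal(size=5)        # direction parameters
g = 1.0                       # scale parameter

lr = 0.1
for _ in range(100):
    norm_v = np.linalg.norm(v)
    w = g * v / norm_v                     # effective weight vector
    out = w @ x
    grad_out = 2.0 * (out - target)        # derivative of squared error
    grad_w = grad_out * x                  # gradient w.r.t. effective weights
    # Chain rule through the reparameterization w = g * v / ||v||.
    grad_g = grad_w @ v / norm_v
    grad_v = (g / norm_v) * (grad_w - grad_g * v / norm_v)
    g -= lr * grad_g
    v -= lr * grad_v

print("effective norm ||w|| =", abs(g))    # equals |g| by construction
```

Because grad_v is orthogonal to v, the update never shrinks the direction vector toward zero; the effective norm is controlled entirely by g, which is what makes an explicit norm constraint equivalent to training in this normalized parameterization.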


Taught by

MITCBMM

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
Good Brain, Bad Brain: Basics
University of Birmingham via FutureLearn
Statistical Learning with R
Stanford University via edX
Machine Learning 1—Supervised Learning
Brown University via Udacity
Fundamentals of Neuroscience, Part 2: Neurons and Networks
Harvard University via edX