Deep Learning Meets Nonparametric Regression: Are Weight-decayed DNNs Locally Adaptive?
Offered By: USC Information Sciences Institute via YouTube
Course Description
Overview
This talk asks, from a statistical point of view, whether weight-decayed deep neural networks are locally adaptive: can they achieve minimax-optimal estimation rates over total-variation and Besov classes, where neural tangent kernels (NTKs) are strictly suboptimal? The answer developed in the talk is that weight decay acts as total variation regularization, and that sufficiently deep parallel ReLU networks approach the optimal rates.
Syllabus
Intro
From the statistical point of view, the success of DNNs is a mystery.
Why do Neural Networks work better?
The "adaptivity" conjecture
NTKs are strictly suboptimal for locally adaptive nonparametric regression
Are DNNs locally adaptive? Can they achieve optimal rates for TV-classes/Besov classes?
Background: Splines are piecewise polynomials
Background: Truncated power basis for splines (summarized after this list)
Weight decay = Total Variation Regularization (see the worked equation after this list)
Weight-decayed L-layer PNN is equivalent to sparse linear regression with learned basis functions (see the architecture sketch after this list)
Main theorem: Parallel ReLU DNNs approach the minimax rates as they get deeper.
Comparing to classical nonparametric regression methods
Examples of Functions with Heterogeneous Smoothness
Step 2: Approximation Error Bound
Summary of take-home messages
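Notes on selected syllabus items

For reference, the two spline background items rest on standard definitions; the summary below is ours, not the slides' exact wording. A degree-m spline with knots t_1 < ... < t_K is a piecewise polynomial of degree m that is C^{m-1} at each knot, and the truncated power basis makes this explicit:

\[
f(x) = \sum_{j=0}^{m} \beta_j x^{j} + \sum_{k=1}^{K} \theta_k (x - t_k)_+^{m},
\qquad (u)_+ := \max\{u, 0\},
\]

so fitting a spline is linear regression on the basis \(\{1, x, \dots, x^{m}\} \cup \{(x - t_k)_+^{m}\}_{k=1}^{K}\), and sparsity in the coefficients \(\theta_k\) means few active knots.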
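The "weight decay = total variation regularization" item can be illustrated in the simplest two-layer ReLU case; the calculation below is a standard one consistent with the talk's claim, not a transcription of it. For \(f(x) = \sum_j a_j (w_j x + b_j)_+\), the ReLU's positive homogeneity lets each unit be rescaled \((a_j, w_j, b_j) \mapsto (a_j / c, \, c\,w_j, \, c\,b_j)\) without changing \(f\), so

\[
\min_{\text{rescalings}} \ \frac{1}{2} \sum_j \left( a_j^{2} + w_j^{2} \right) = \sum_j |a_j w_j|,
\]

using \(a_j^{2} + w_j^{2} \ge 2 |a_j w_j|\) with equality when \(|a_j| = |w_j|\). For distinct knots \(-b_j / w_j\), each unit contributes a kink of size \(|a_j| |w_j|\) to \(f'\), so

\[
\mathrm{TV}(f') = \int |f''(x)| \, dx = \sum_j |a_j| \, |w_j|.
\]

Hence minimizing the weight decay penalty subject to fitting the data penalizes the total variation of \(f'\); the talk's L-layer equivalence to sparse linear regression with learned basis functions generalizes this picture.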
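Finally, a minimal PyTorch sketch of the parallel ReLU architecture the main theorem concerns: many narrow depth-L subnetworks whose outputs are summed, trained with the usual squared-L2 (weight decay) penalty. All hyperparameters below are illustrative assumptions, not the talk's settings.

import torch
import torch.nn as nn

class ParallelReLUNet(nn.Module):
    # M parallel, narrow, depth-L ReLU subnetworks whose outputs are summed.
    # m_subnets, depth, and width are illustrative, not the talk's choices.
    def __init__(self, m_subnets=64, depth=4, width=2, in_dim=1):
        super().__init__()

        def make_subnet():
            layers = [nn.Linear(in_dim, width), nn.ReLU()]
            for _ in range(depth - 2):
                layers += [nn.Linear(width, width), nn.ReLU()]
            layers.append(nn.Linear(width, 1))
            return nn.Sequential(*layers)

        self.subnets = nn.ModuleList(make_subnet() for _ in range(m_subnets))

    def forward(self, x):
        # "Parallel" composition: the network output is the sum of its subnetworks.
        return sum(net(x) for net in self.subnets)

# Weight decay enters as the optimizer's L2 penalty on all parameters.
model = ParallelReLUNet()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)

The theorem is a statement about this family as the depth grows; the sketch fixes only the architecture and the penalty.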
Taught by
USC Information Sciences Institute
Related Courses
Deep Learning with Tensorflow (IBM via edX)
Numerical Methods And Simulation Techniques For Scientists And Engineers (Indian Institute of Technology Guwahati via Swayam)
Scientific Computing using Matlab (Indian Institute of Technology Delhi via Swayam)
Numerical Computations in MATLAB (Udemy)
Performing Statistical Analysis with MATLAB (Pluralsight)