
Stochastic Variance Reduction Methods for Policy Evaluation

Offered By: Simons Institute via YouTube

Tags

Reinforcement Learning Courses
Optimization Algorithms Courses
Stochastic Gradient Descent Courses

Course Description

Overview

Explore stochastic variance reduction methods for policy evaluation in reinforcement learning in this 47-minute lecture by Lihong Li of Microsoft Research. Delve into the challenges of minimizing the mean squared projected Bellman error (MSPBE) and its formulation as a saddle-point problem. Examine gradient-based approaches, including primal-dual batch gradient, stochastic gradient descent, and variance-reduced techniques such as Stochastic Variance Reduced Gradient (SVRG) and SAGA. Gain insights into complexity analysis and preliminary experiments on benchmarks including random MDPs and the Mountain Car problem. Compare these methods with previous work and understand their implications for interactive learning in reinforcement learning contexts.
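
For context on the saddle-point formulation mentioned above (the exact notation in the lecture may differ; the linear-feature setup below is an assumption based on the standard MSPBE literature), the objective can be written as MSPBE(θ) = (1/2) ||Aθ − b||²_{C⁻¹} for matrices A, C and a vector b estimated from sampled transitions, which by convex conjugacy is equivalent to the saddle-point problem min_θ max_w wᵀ(b − Aθ) − (1/2) wᵀ C w. The Python sketch below shows one way SVRG-style variance reduction could be applied to the resulting primal-dual stochastic gradients; the function name, step sizes, and uniform sampling scheme are illustrative choices, not the lecture's algorithm verbatim.

import numpy as np

def svrg_policy_evaluation(Phi, Phi_next, rewards, rho, gamma,
                           sigma_theta=0.01, sigma_w=0.01,
                           num_epochs=20, seed=0):
    """Sketch of SVRG for the MSPBE saddle-point problem.

    Phi, Phi_next: (n, d) feature matrices for states s_t and s_{t+1};
    rewards, rho: (n,) rewards and importance weights.
    All names and step sizes are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = Phi.shape

    # Per-dataset averages A, b, C of the per-sample quantities A_t, b_t, C_t.
    diff = Phi - gamma * Phi_next                  # phi_t - gamma * phi_{t+1}
    A = (rho[:, None] * Phi).T @ diff / n          # (d, d)
    b = (rho * rewards) @ Phi / n                  # (d,)
    C = Phi.T @ Phi / n                            # (d, d)

    theta = np.zeros(d)   # primal variable (value-function weights)
    w = np.zeros(d)       # dual variable

    for _ in range(num_epochs):
        # Snapshot and full (batch) gradients of L(theta, w) at the snapshot.
        theta_s, w_s = theta.copy(), w.copy()
        g_theta_full = -A.T @ w_s
        g_w_full = b - A @ theta_s - C @ w_s

        for _ in range(n):  # inner loop: roughly one pass worth of samples
            t = rng.integers(n)
            A_t = rho[t] * np.outer(Phi[t], diff[t])
            b_t = rho[t] * rewards[t] * Phi[t]
            C_t = np.outer(Phi[t], Phi[t])

            # Variance-reduced stochastic gradients (current minus snapshot
            # per-sample gradient, plus the full snapshot gradient).
            g_theta = -A_t.T @ w - (-A_t.T @ w_s) + g_theta_full
            g_w = ((b_t - A_t @ theta - C_t @ w)
                   - (b_t - A_t @ theta_s - C_t @ w_s) + g_w_full)

            theta -= sigma_theta * g_theta   # primal descent
            w += sigma_w * g_w               # dual ascent

    return theta

A SAGA-style variant would follow the same primal-dual updates but replace the periodic full-gradient snapshot with a running table of per-sample gradients that is updated as samples are drawn.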

Syllabus

Intro
Reinforcement Learning
Policy Evaluation (PE)
Main Results
Notation
Objective Function for PE
Outline
Challenge with MSPBE
MSPBE as Saddle-Point Problem
Primal-Dual Batch Gradient for L(θ, w)
Stochastic Gradient Descent for L(θ, w)
Stochastic Variance Reduced Gradient (SVRG)
SAGA
Extensions
Complexity: Summary
Preliminary Experiments
Experiments: Benchmarks
Random MDPs
Mountain Car
Previous Work
Conclusions


Taught by

Simons Institute

Related Courses

Deep Learning for Natural Language Processing
University of Oxford via Independent
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
DeepLearning.AI via Coursera
Deep Learning Part 1 (IITM)
Indian Institute of Technology Madras via Swayam
Deep Learning - Part 1
Indian Institute of Technology, Ropar via Swayam
Logistic Regression with Python and Numpy
Coursera Project Network via Coursera