YoVDO

Stochastic Variance Reduction Methods for Policy Evaluation

Offered By: Simons Institute via YouTube

Tags

Reinforcement Learning Courses, Optimization Algorithms Courses, Stochastic Gradient Descent Courses

Course Description

Overview

Explore stochastic variance reduction methods for policy evaluation in reinforcement learning in this 47-minute lecture by Lihong Li of Microsoft Research. Learn why minimizing the mean squared projected Bellman error (MSPBE) is challenging and how the objective can be reformulated as a saddle-point problem. Examine gradient-based approaches to that problem, including primal-dual batch gradient, stochastic gradient descent, and the variance-reduced methods Stochastic Variance Reduced Gradient (SVRG) and SAGA. The lecture also covers a complexity summary, preliminary experiments on random MDPs and the Mountain Car benchmark, a comparison with previous work, and the implications for interactive learning in reinforcement learning.
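
For orientation, the saddle-point reformulation mentioned above can be written out as follows. This is a sketch of one common form, assuming linear value-function approximation over n observed transitions; the symbols (feature vectors \phi_t and \phi'_t, reward r_t, discount \gamma, dual variable w) are notation assumed here, and the lecture's own notation may differ.

```latex
% Empirical quantities over n transitions (\phi_t, r_t, \phi'_t):
\hat{A} = \frac{1}{n}\sum_{t=1}^{n} \phi_t(\phi_t - \gamma\phi'_t)^\top, \quad
\hat{b} = \frac{1}{n}\sum_{t=1}^{n} r_t\,\phi_t, \quad
\hat{C} = \frac{1}{n}\sum_{t=1}^{n} \phi_t\phi_t^\top

% Objective for policy evaluation:
\mathrm{MSPBE}(\theta) = \tfrac{1}{2}\,\bigl\|\hat{A}\theta - \hat{b}\bigr\|^2_{\hat{C}^{-1}}

% Equivalent saddle-point problem:
\min_{\theta}\,\max_{w}\; L(\theta, w)
  = -\,w^\top \hat{A}\theta \;-\; \tfrac{1}{2}\,w^\top \hat{C}\,w \;+\; w^\top \hat{b}

% Inner maximization: w^\star = \hat{C}^{-1}(\hat{b} - \hat{A}\theta), and
% L(\theta, w^\star) = \tfrac{1}{2}(\hat{A}\theta - \hat{b})^\top \hat{C}^{-1} (\hat{A}\theta - \hat{b})
%                    = \mathrm{MSPBE}(\theta)
```

The saddle-point form avoids the explicit inverse of \hat{C}, and L is a plain average over transitions, which is what makes per-sample stochastic gradient methods applicable.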

Syllabus

Intro
Reinforcement Learning
Policy Evaluation (PE)
Main Results
Notation
Objective Function for PE
Outline
Challenge with MSPBE
MSPBE as Saddle-Point Problem
Primal-Dual Batch Gradient for L(θ, w)
Stochastic Gradient Descent for L(θ, w)
Stochastic Variance Reduced Gradient (SVRG)
SAGA
Extensions
Complexity: Summary
Preliminary Experiments
Experiments: Benchmarks
Random MDPs
Mountain Car
Previous Work
Conclusions
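
To make the SVRG item in the syllabus concrete, here is a minimal sketch of primal-dual SVRG applied to a saddle-point objective of the form L(θ, w) = -wᵀAθ - ½wᵀCw + wᵀb, with A, b, C averaged over transitions. It is an illustration under stated assumptions, not the lecture's implementation: the function names, the synthetic data, the step sizes, and the inner-loop length of 3n are all hypothetical choices made for this example.

```python
import numpy as np


def make_synthetic_transitions(n=200, d=5, gamma=0.9, seed=0):
    """Hypothetical data: random features, rewards from a noisy linear model."""
    rng = np.random.default_rng(seed)
    phi = rng.normal(size=(n, d))        # features of current states
    phi_next = rng.normal(size=(n, d))   # features of next states
    theta_true = rng.normal(size=d)
    rewards = (phi - gamma * phi_next) @ theta_true + 0.1 * rng.normal(size=n)
    return phi, phi_next, rewards


def mspbe(theta, phi, phi_next, rewards, gamma=0.9):
    """Empirical MSPBE: 0.5 * (A theta - b)^T C^{-1} (A theta - b)."""
    n = len(rewards)
    delta = phi - gamma * phi_next
    A = phi.T @ delta / n
    b = phi.T @ rewards / n
    C = phi.T @ phi / n
    resid = A @ theta - b
    return 0.5 * resid @ np.linalg.solve(C, resid)


def grads(theta, w, phi, delta, rewards):
    """Gradients of L(theta, w), averaged over the rows that are passed in."""
    k = len(rewards)
    phi_w = phi @ w              # phi_t . w for each row
    delta_theta = delta @ theta  # (phi_t - gamma * phi'_t) . theta for each row
    g_theta = -delta.T @ phi_w / k                     # -A^T w
    g_w = phi.T @ (rewards - delta_theta - phi_w) / k  # b - A theta - C w
    return g_theta, g_w


def svrg_policy_evaluation(phi, phi_next, rewards, gamma=0.9,
                           step_theta=0.005, step_w=0.005,
                           epochs=50, seed=0):
    """Primal-dual SVRG: descend in theta, ascend in w."""
    rng = np.random.default_rng(seed)
    n, d = phi.shape
    delta = phi - gamma * phi_next
    theta, w = np.zeros(d), np.zeros(d)
    for _ in range(epochs):
        # Snapshot the iterate and compute the full (batch) gradient once.
        theta_snap, w_snap = theta.copy(), w.copy()
        full_gt, full_gw = grads(theta_snap, w_snap, phi, delta, rewards)
        for _ in range(3 * n):  # inner-loop length is an illustrative choice
            t = rng.integers(n)
            s = slice(t, t + 1)
            gt, gw = grads(theta, w, phi[s], delta[s], rewards[s])
            gt0, gw0 = grads(theta_snap, w_snap, phi[s], delta[s], rewards[s])
            # Variance-reduced direction: sample gradient at the current
            # point, minus the same sample at the snapshot, plus the batch
            # gradient at the snapshot.
            theta = theta - step_theta * (gt - gt0 + full_gt)
            w = w + step_w * (gw - gw0 + full_gw)
    return theta


if __name__ == "__main__":
    phi, phi_next, rewards = make_synthetic_transitions()
    theta = svrg_policy_evaluation(phi, phi_next, rewards)
    print("MSPBE at zero:", mspbe(np.zeros(phi.shape[1]), phi, phi_next, rewards))
    print("MSPBE after SVRG:", mspbe(theta, phi, phi_next, rewards))
```

Each inner step touches a single transition, while the full gradient is computed only once per epoch. Removing the two snapshot terms from the update turns this into plain stochastic gradient descent on L(θ, w), which is a convenient way to compare the two methods on the same data.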


Taught by

Simons Institute

Related Courses

Computational Neuroscience
University of Washington via Coursera
Reinforcement Learning
Brown University via Udacity
Reinforcement Learning
Indian Institute of Technology Madras via Swayam
FA17: Machine Learning
Georgia Institute of Technology via edX
Introduction to Reinforcement Learning
Higher School of Economics via Coursera