Stochastic Variance Reduction Methods for Policy Evaluation

Offered By: Simons Institute via YouTube

Course Description

Overview

Explore stochastic variance reduction methods for policy evaluation in reinforcement learning in this 47-minute lecture by Lihong Li of Microsoft Research. The talk covers the challenge of minimizing the mean squared projected Bellman error (MSPBE) and its formulation as a saddle-point problem. It then examines gradient-based approaches: primal-dual batch gradient, stochastic gradient descent, and the variance-reduced methods Stochastic Variance Reduced Gradient (SVRG) and SAGA. The lecture closes with a complexity summary, preliminary experiments and benchmarks on random MDPs and the Mountain Car problem, and a comparison with previous work.

Syllabus

Intro
Reinforcement Learning
Policy Evaluation (PE)
Main Results
Notation
Objective Function for PE
Outline
Challenge with MSPBE
MSPBE as Saddle-Point Problem
Primal-Dual Batch Gradient for L(θ, w)
Stochastic Gradient Descent for L(θ, w)
Stochastic Variance Reduced Gradient (SVRG)
SAGA
Extensions
Complexity: Summary
Preliminary Experiments
Experiments: Benchmarks
Random MDPs
Mountain Car
Previous Work
Conclusions

Taught by

Simons Institute
