Stochastic Variance Reduction Methods for Policy Evaluation
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore stochastic variance reduction methods for policy evaluation in reinforcement learning through this 47-minute lecture by Lihong Li from Microsoft Research. Delve into the challenges of minimizing the mean squared projected Bellman error (MSPBE) and its reformulation as a saddle-point problem. Examine gradient-based approaches to that problem, including primal-dual batch gradient, stochastic gradient descent, and the variance-reduced methods Stochastic Variance Reduced Gradient (SVRG) and SAGA. Gain insights into the complexity analysis, preliminary experiments, and benchmarks on random MDPs and the Mountain Car problem. Compare these methods with previous work on policy evaluation.
Syllabus
Intro
Reinforcement Learning
Policy Evaluation (PE)
Main Results
Notation
Objective Function for PE
Outline
Challenge with MSPBE
MSPBE as Saddle-Point Problem
Primal-Dual Batch Gradient for L(θ,w)
Stochastic Gradient Descent for L(θ,w)
Stochastic Variance Reduced Gradient (SVRG)
SAGA
Extensions
Complexity: Summary
Preliminary Experiments
Experiments: Benchmarks
Random MDPs
Mountain Car
Previous Work
Conclusions
Taught by
Simons Institute
Related Courses
Deep Learning for Natural Language Processing
University of Oxford via Independent
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
DeepLearning.AI via Coursera
Deep Learning Part 1 (IITM)
Indian Institute of Technology Madras via Swayam
Deep Learning - Part 1
Indian Institute of Technology, Ropar via Swayam
Logistic Regression with Python and Numpy
Coursera Project Network via Coursera