Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Offered By: Simons Institute via YouTube

Tags

Reinforcement Learning Courses, Deep Learning Courses, Markov Decision Processes Courses, Approximation Courses, Policy Gradient Methods Courses

Course Description

Overview

This 55-minute lecture by Alekh Agarwal of Microsoft Research Redmond, part of the "Emerging Challenges in Deep Learning" series at the Simons Institute, examines optimality and approximation for policy gradient methods in Markov Decision Processes (MDPs). It opens with MDP preliminaries, policy parameterizations, and the policy gradient algorithm, with a focus on the softmax parameterization and entropy regularization. It then covers the convergence of entropy-regularized policy gradient, a natural solution, and proof ideas, followed by restricted parameterizations, the natural policy gradient (NPG) update, assumptions on policies, and the extension to finite samples.
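For readers unfamiliar with the softmax parameterization named above, the following is a minimal sketch and is not taken from the lecture. It assumes a single-state MDP (a bandit) so that the exact gradient has a closed form, and it omits entropy regularization. The reward values, step size, and function names are all hypothetical choices made for illustration.

```python
import math


def softmax(theta):
    """Map a list of logits to action probabilities (numerically stable)."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def expected_reward(theta, rewards):
    """Objective J(theta) = sum_a pi_theta(a) * r(a) for a single state."""
    pi = softmax(theta)
    return sum(p * r for p, r in zip(pi, rewards))


def policy_gradient_step(theta, rewards, lr):
    """One exact gradient-ascent step on J.

    Under the single-state assumption, dJ/dtheta_a = pi(a) * (r(a) - J).
    """
    pi = softmax(theta)
    j = sum(p * r for p, r in zip(pi, rewards))
    return [t + lr * p * (r - j) for t, p, r in zip(theta, pi, rewards)]


def run(rewards, steps=2000, lr=0.5):
    """Start from the uniform policy (all logits zero) and ascend."""
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        theta = policy_gradient_step(theta, rewards, lr)
    return theta


# Hypothetical three-action problem; action 0 is optimal.
rewards = [1.0, 0.5, 0.0]
theta = run(rewards)
pi = softmax(theta)
print("policy:", [round(p, 4) for p in pi])
print("expected reward:", round(expected_reward(theta, rewards), 4))
```

In this toy setting the policy concentrates on the best action, but only gradually, because the gradient shrinks as the probabilities of suboptimal actions shrink.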

Syllabus

Intro
Questions of interest
Main challenges
MDP Preliminaries
Policy parameterizations
Policy gradient algorithm
Policy gradient example: Softmax parameterization
Entropy regularization
Convergence of Entropy regularized PG
A natural solution
Proof ideas
Restricted parameterizations
A closer look at Natural Policy Gradient
Assumptions on policies
Extension to finite samples
Looking ahead


Taught by

Simons Institute

Related Courses

Deep Learning and Python Programming for AI with Microsoft Azure
Cloudswyft via FutureLearn
Advanced Artificial Intelligence on Microsoft Azure: Deep Learning, Reinforcement Learning and Applied AI
Cloudswyft via FutureLearn
Overview of Advanced Methods of Reinforcement Learning in Finance
New York University (NYU) via Coursera
AI for Cybersecurity
Johns Hopkins University via Coursera
人工智慧:機器學習與理論基礎 (Artificial Intelligence - Learning & Theory)
National Taiwan University via Coursera