YoVDO

On the Hardness of Reinforcement Learning With Value-Function Approximation

Offered By: Simons Institute via YouTube

Tags

Reinforcement Learning Courses
Deep Learning Courses
Markov Decision Processes Courses

Course Description

Overview

Explore the complexities of Reinforcement Learning with value-function approximation in this 54-minute lecture by Nan Jiang from the University of Illinois Urbana-Champaign. Delve into the applications of RL, compare it with Supervised Learning, and understand the intricacies of Markov Decision Processes. Examine batch learning in MDPs, including examples from video game playing, and analyze the assumptions on data and MDP dynamics. Investigate algorithms for batch RL and learn how things can go wrong with restricted function classes. Discover the importance of strong assumptions like "completeness" and why realizability alone may be insufficient. Follow attempts to prove a key conjecture and grasp its significance in the field. This talk, part of the "Emerging Challenges in Deep Learning" series at the Simons Institute, offers valuable insights into the hardness of RL with value-function approximation.

Syllabus

Intro
Reinforcement Learning (RL) Applications
Value-function Approximation
Comparison between SL and RL
Markov Decision Process (MDP)
Batch learning in MDPs
Example: Video game playing
Batch learning in large MDPs
Assumption on data (?)
Assumption on data & MDP dynamics
Algorithm for batch RL
How things go wrong (w/ restricted class)
Fix using a strong assumption ("completeness")
Realizability alone is insufficient?
Proving the conjecture: Attempt 1
Checklist for a plausible construction
Importance of the conjecture
Importance of the construction


Taught by

Simons Institute

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
機器學習技法 (Machine Learning Techniques)
National Taiwan University via Coursera
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Прикладные задачи анализа данных (Applied Problems in Data Analysis)
Moscow Institute of Physics and Technology via Coursera
Leading Ambitious Teaching and Learning
Microsoft via edX