YoVDO

Online Learning in Markov Decision Processes - Part 2

Offered By: Simons Institute via YouTube

Tags

Reinforcement Learning Courses
Markov Decision Processes Courses
Online Learning Courses
Function Approximation Courses

Course Description

Overview

This lecture from the Theory of Reinforcement Learning Boot Camp covers online learning in Markov Decision Processes (MDPs). It opens with the adversarial setting: performance measures, oblivious and non-oblivious adversaries, and why learning with changing transitions is hard. It then presents the formal protocol for online learning in a fixed MDP, temporal dependencies, and the regret decomposition. The MDP-Expert (MDP-E) algorithm is introduced together with its guarantees and its use under bandit feedback. The lecture then turns to online linear optimization, mirror descent, and the Online Relative Entropy Policy Search (O-REPS) algorithm, and compares the resulting guarantees. It closes with function approximation in MDP-E, O-REPS with uncertain models, and an outlook on future directions.

Syllabus

Intro
MARKOV DECISION PROCESSES
ADVERSARIAL
PERFORMANCE MEASURE: REGRET
OUTLINE
NON-OBLIVIOUS ADVERSARIES
WHAT WENT WRONG?
OBLIVIOUS ADVERSARIES
LEARNING WITH CHANGING TRANSITIONS IS HARD
PROOF CONSTRUCTION
SLOWLY CHANGING MDPS
FORMAL PROTOCOL: Online learning in a fixed MDP. For each round t = 1, 2, ..., the learner observes state $x_t \in \mathcal{X}$
TEMPORAL DEPENDENCES
REGRET DECOMPOSITION
THE DRIFT TERMS
LOCAL-TO-GLOBAL
THE MDP-EXPERT ALGORITHM
GUARANTEES FOR MDP-E
BANDIT FEEDBACK
ONLINE LINEAR OPTIMIZATION
ONLINE MIRROR DESCENT
THE ONLINE REPS ALGORITHM (O-REPS)
GUARANTEES FOR O-REPS
COMPARISON OF GUARANTEES
MDP-E WITH FUNCTION APPROXIMATION: MDP-E only needs a good approximation of the action-value
O-REPS WITH UNCERTAIN MODELS
OUTLOOK


Taught by

Simons Institute

Related Courses

E-learning and Digital Cultures
University of Edinburgh via Coursera
Construcción de un Curso Virtual en la Plataforma Moodle
Universidad de San Martín de Porres via Miríadax
Teaching Computing: Part 2
University of East Anglia via FutureLearn
Learning Design
University of Leicester via EMMA
Nuevos escenarios de aprendizaje digital
University of the Basque Country via Miríadax