
Hindsight Learning for MDPs with Exogenous Inputs

Offered By: GERAD Research Center via YouTube

Tags

Markov Decision Processes, Cloud Computing, Sequential Decision Making

Course Description

Overview

Explore a 51-minute DS4DM Coffee Talk on Hindsight Learning for Markov Decision Processes (MDPs) with Exogenous Inputs, presented by Sean Sinclair from MIT. Dive into sequential decision-making under uncertainty, focusing on resource management problems where exogenous variables outside the decision-maker's control affect outcomes. Learn about Exo-MDPs and the class of data-efficient algorithms called Hindsight Learning (HL). Discover how HL algorithms achieve data efficiency by revisiting past decisions in hindsight: once the exogenous inputs are observed, the counterfactual consequences of alternative decisions can be inferred and used to accelerate policy improvements. Compare HL against classic baselines in multi-secretary and airline revenue management problems. Examine the scalability of these algorithms in a critical cloud resource management scenario: allocating Virtual Machines (VMs) to physical machines, with simulations using real datasets from a major public cloud provider. Gain insights into how HL algorithms outperform domain-specific heuristics and state-of-the-art reinforcement learning methods in various applications.
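
To make the hindsight idea concrete, below is a minimal, illustrative Python sketch of hindsight imitation on a toy multi-secretary problem (one of the benchmarks mentioned above). The problem setup, feature choices, and the logistic imitation policy are assumptions for illustration only, not the algorithm presented in the talk: once an episode's exogenous inputs (candidate values) are revealed, the hindsight-optimal accept/reject decisions can be computed offline and used as supervision for a policy that only sees the past.

import numpy as np

rng = np.random.default_rng(0)
HORIZON, BUDGET, EPISODES, LR = 50, 10, 2000, 0.05

def sample_exogenous_trace():
    # Exogenous inputs (candidate values) are drawn independently of the agent's actions.
    return rng.uniform(0.0, 1.0, size=HORIZON)

def hindsight_actions(values):
    # With the whole trace known, accepting the BUDGET largest values is optimal.
    # This counterfactual "best in hindsight" solution serves as the learning target.
    best = np.argsort(values)[-BUDGET:]
    acts = np.zeros(HORIZON, dtype=int)
    acts[best] = 1  # 1 = accept, 0 = reject
    return acts

def features(t, value, budget_left):
    # Online-observable state: current value, fraction of time and budget remaining.
    return np.array([1.0, value, (HORIZON - t) / HORIZON, budget_left / BUDGET])

# Train a logistic policy to imitate the hindsight-optimal accept/reject decisions.
w = np.zeros(4)
for _ in range(EPISODES):
    values = sample_exogenous_trace()
    labels = hindsight_actions(values)
    budget_left = BUDGET  # budget feature tracks the hindsight trajectory during training
    for t in range(HORIZON):
        x = features(t, values[t], budget_left)
        p = 1.0 / (1.0 + np.exp(-w @ x))   # probability of accepting
        w += LR * (labels[t] - p) * x      # logistic-regression gradient step
        if labels[t] == 1:
            budget_left -= 1

def run_policy(values):
    # Evaluate the learned policy online, with no access to future exogenous values.
    budget_left, reward = BUDGET, 0.0
    for t in range(HORIZON):
        if budget_left == 0:
            break
        x = features(t, values[t], budget_left)
        if 1.0 / (1.0 + np.exp(-w @ x)) > 0.5:
            reward += values[t]
            budget_left -= 1
    return reward

print(np.mean([run_policy(sample_exogenous_trace()) for _ in range(200)]))

In this toy setting the hindsight labels are cheap to compute because the exogenous trace fully determines the best achievable outcome; the talk discusses how this structure is exploited at much larger scale, e.g. for VM allocation.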

Syllabus

Hindsight Learning for MDPs with Exogenous Inputs, Sean Sinclair


Taught by

GERAD Research Center

Related Courses

Toward Generalizable Embodied AI for Machine Autonomy
Bolei Zhou via YouTube
What Are the Statistical Limits of Offline Reinforcement Learning With Function Approximation?
Simons Institute via YouTube
Better Learning from the Past - Counterfactual - Batch RL
Simons Institute via YouTube
Off-Policy Policy Optimization
Simons Institute via YouTube
Provably Efficient Reinforcement Learning with Linear Function Approximation - Chi Jin
Institute for Advanced Study via YouTube