Formal Languages and Automata for Reward Function Specification and Efficient Reinforcement Learning
Offered By: Simons Institute via YouTube
Course Description
Overview
Syllabus
Intro
Acknowledgements
Reinforcement Learning (RL)
Challenges of Real-World RL
Goals and Preferences
Linear Temporal Logic (LTL): A compelling logic to express temporal properties of traces
Challenges to RL
Toy Problem Disclaimer
Running Example
Decoupling Transition and Reward Functions
The Rest of the Talk
Define a Reward Function using a Reward Machine
Reward Function Vocabulary
Simple Reward Machine
Reward Machines in Action
Other Reward Machines
Q-Learning Baseline
Option-Based Hierarchical RL (HRL)
HRL with RM-Based Pruning (HRL-RM)
HRL Methods Can Find Suboptimal Policies
Q-Learning for Reward Machines (QRM)
QRM In Action
Recall: Methods for Exploiting RM Structure
5. QRM + Reward Shaping (QRM + RS)
Test Domains
Test in Discrete Domains
Office World Experiments
Minecraft World Experiments
Function Approximation with QRM
Water World Experiments
Creating Reward Machines
Reward Specification: one size does not fit all
1. Construct Reward Machine from Formal Languages
Generate RM using a Symbolic Planner
Learn RMs for Partially-Observable RL
Taught by
Simons Institute
Related Courses
Logic: Language and Information 1
University of Melbourne via Coursera
Logic: Language and Information 2
University of Melbourne via Coursera
Language, Proof and Logic
Stanford University via edX
理论计算机科学基础 | Introduction to Theoretical Computer Science
Peking University via edX
离散数学概论 Discrete Mathematics Generality
Peking University via Coursera