CS885: Multi-Armed Bandits
Offered By: Pascal Poupart via YouTube
Course Description
Overview
Explore the fascinating world of multi-armed bandits in this comprehensive 57-minute lecture by Pascal Poupart. Delve into key concepts such as exploration-exploitation trade-offs, stochastic bandits, and online optimization. Learn about the origins of bandits in gambling and their practical applications. Understand the simplified version of the problem, various heuristics, and the notion of regret. Discover the epsilon-greedy strategy and its implementation in single-state scenarios. Gain insights into different approaches and their effectiveness in real-world situations.
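The epsilon-greedy strategy mentioned in the overview can be illustrated with a short sketch. The snippet below is not from the lecture itself; it is a minimal illustration, assuming a Bernoulli bandit where each arm pays reward 1 with a fixed (hidden) probability. With probability epsilon the agent explores a random arm; otherwise it exploits the arm with the best empirical mean so far.

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (illustrative sketch).

    true_means: hidden success probability of each arm.
    Returns (empirical estimates, pull counts per arm, total reward).
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k        # number of pulls per arm
    estimates = [0.0] * k   # empirical mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a uniformly random arm.
            arm = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(k), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward
```

With enough pulls, the arm with the highest true mean accumulates most of the plays, which is the exploration-exploitation trade-off the lecture formalizes.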
Syllabus
Multi-armed bandits
Exploration-exploitation
Stochastic bandits
Bandits from gambling
Bandits in practice
Online optimization
Simplified version
The problem
Heuristics
Notion of regret
Epsilon-greedy strategy
Single state
Epsilon-greedy
Different approaches
In practice
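The "Notion of regret" topic in the syllabus refers to the standard performance measure for bandit algorithms. As a hedged reference (this is the usual textbook definition, not quoted from the lecture), the expected regret after $n$ pulls is

```latex
R_n = n\,\mu^* - \mathbb{E}\!\left[\sum_{t=1}^{n} r_t\right],
\qquad \mu^* = \max_{a} \mu_a ,
```

where $\mu_a$ is the expected reward of arm $a$ and $r_t$ is the reward received at step $t$. A good strategy keeps $R_n$ growing sublinearly in $n$.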
Taught by
Pascal Poupart
Related Courses
Inside TensorFlow - TF-Agents (TensorFlow via YouTube)
Provably Efficient Reinforcement Learning with Linear Function Approximation - Chi Jin (Institute for Advanced Study via YouTube)
Exploration with Limited Memory - Streaming Algorithms for Coin Tossing, Noisy Comparisons, and Multi-Armed Bandits (Association for Computing Machinery (ACM) via YouTube)
How Slot Machines Are Advancing the State of the Art in Computer Go AI (Churchill CompSci Talks via YouTube)
Game Theoretic Learning and Spectrum Management - Part 2 (IEEE Signal Processing Society via YouTube)