YoVDO

Optimal Gradient-Based Algorithms for Non-Concave Bandit Optimization

Offered By: Simons Institute via YouTube

Tags

Algorithm Design Courses Machine Learning Courses Probability Distributions Courses

Course Description

Overview

Explore a 31-minute lecture by Qi Lei from Princeton University on optimal gradient-based algorithms for non-concave bandit optimization. Delve into advanced topics including the stochastic bandit eigenvector problem, noisy power methods, and stochastic low-rank linear rewards. Examine information theoretical understanding, regret comparisons for quadratic rewards, and higher-order polynomial bandit problems. Learn about optimal regret for different types of reward functions and potential extensions to reinforcement learning in simulator settings. Gain insights into cutting-edge research in sampling algorithms and geometries on probability distributions presented at the Simons Institute.

Syllabus

Intro
Bandit Problem
Our focus: beyond linearity and concavity
Problem I: the Stochastic Bandit Eigenvector Problem
Some related work
Information theoretical understanding
Beyond cubic dimension dependence
Our method: noisy power method
Problem II: Stochastic Low-rank Linear Reward
Our algorithm: noisy subspace iteration
Regret comparisons: quadratic reward
Higher-order problems
Problem III: Symmetric High-order Polynomial bandit
Problem IV: Asymmetric High-order Polynomial bandit
Lower bound: Optimal dependence on a
Overall Regret Comparisons
Extension to RL in simulator setting
Conclusions: We find optimal regret for different types of reward functions
Future directions
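The syllabus repeatedly references the noisy power method for the stochastic bandit eigenvector problem. As a rough illustration only (not the speaker's actual algorithm; the oracle, noise model, and all parameters below are made up for the demo), a minimal sketch of power iteration under noisy matrix-vector feedback:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20

# Hypothetical setup: an unknown symmetric reward matrix M with a
# dominant eigenvector u that the learner wants to recover.
u = rng.standard_normal(d)
u /= np.linalg.norm(u)
M = 10.0 * np.outer(u, u) + 0.1 * np.eye(d)

def oracle(v):
    # Stand-in for bandit feedback: a noisy estimate of M @ v.
    return M @ v + 0.05 * rng.standard_normal(d)

def noisy_power_method(oracle, d, iters=100):
    # Classic power iteration, but each multiplication by M is
    # replaced with a call to the noisy oracle.
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        g = oracle(v)
        v = g / np.linalg.norm(g)
    return v

v_hat = noisy_power_method(oracle, d)
# With a large eigengap, the estimate aligns closely with u.
print(abs(float(v_hat @ u)))
```

The eigengap (10.0 vs. 0.1 here) controls how quickly the iterate aligns with the top eigenvector despite the per-step noise; shrinking the gap or increasing the noise scale degrades the final alignment.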


Taught by

Simons Institute

Related Courses

Natural Language Processing
Columbia University via Coursera
Intro to Algorithms
Udacity
Algorithm Design and Implementation (Conception et mise en œuvre d'algorithmes)
École Polytechnique via Coursera
Paradigms of Computer Programming
Université catholique de Louvain via edX
Data Structures and Algorithm Design Part I
Tsinghua University via edX