Bandit Algorithm (Online Machine Learning)
Offered By: Indian Institute of Technology Bombay via Swayam
Course Description
Overview
In many scenarios one faces an uncertain environment in which the best action to play is not known a priori. How can one obtain the best possible reward or utility in such scenarios? One natural approach is to first explore the environment to identify the 'best' actions and then exploit them. However, this gives rise to an exploration vs. exploitation dilemma: on the one hand, we need to explore enough to identify the best action and be confident of its optimality; on the other hand, the best actions need to be exploited as often as possible to obtain higher reward. In this course we will study many bandit algorithms that balance exploration and exploitation well in various random environments to accumulate good rewards over the duration of play. Bandit algorithms find applications in online advertising, recommendation systems, auctions, routing, e-commerce, and any online scenario where information can be gathered incrementally.
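To make the exploration vs. exploitation trade-off concrete, here is a minimal sketch of UCB1, one of the stochastic bandit algorithms named in the syllabus. It is illustrative only and not taken from the course material: the function name `ucb1`, the Bernoulli reward model, the arm means, and the horizon are all assumptions chosen for the example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms (hypothetical setup).

    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was played
    sums = [0.0] * n_arms   # cumulative reward observed per arm
    total = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initial exploration: play every arm once.
            arm = t - 1
        else:
            # Empirical mean (exploitation) plus a confidence bonus
            # (exploration) that shrinks as an arm is played more often.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Simulated Bernoulli reward; the learner never sees arm_means.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward

    return counts, total


# Assumed example: three arms, the last one being the best.
counts, total = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(counts, total)
```

In a run like this, most plays should go to the arm with the highest mean, while the weaker arms are still sampled occasionally so that their estimates remain trustworthy.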
INTENDED AUDIENCE: Computer Science, Electrical Engineering, Operations Research, Mathematics and Statistics
PREREQUISITES: Basics of Probability Theory and Optimization
INDUSTRIES SUPPORT: All companies related to Internet technologies (e.g., Google, Microsoft, Flipkart, Ola, Amazon, etc.)
Syllabus
COURSE LAYOUT
Week 1: Introduction to Bandit Algorithms. From Batch to Online Setting
Week 2: Adversarial Setting with Full Information (Halving, WM Algorithm)
Week 3: Adversarial Setting with Bandit Information
Week 4: Regret lower bounds for Adversarial Setting
Week 5: Introduction to Stochastic Setting and various regret notions
Week 6: A primer on Concentration Inequalities
Week 7: Stochastic Bandit Algorithms: UCB, KL-UCB
Week 8: Lower bounds for Stochastic Bandits
Week 9: Introduction to contextual bandits
Week 10: Overview of contextual bandit algorithms
Week 11: Introduction to pure exploration setups (fixed confidence vs. fixed budget)
Week 12: Algorithms for pure exploration (LUCB, KL-LUCB, lil’UCB)
Taught by
Prof. Manjesh Hanawal
Related Courses
Data Analysis (Анализ данных), Novosibirsk State University via Coursera
Approximation Algorithms, EIT Digital via Coursera
Basic Statistics, University of Amsterdam via Coursera
What are the Chances? Probability and Uncertainty in Statistics, Johns Hopkins University via Coursera
Understanding Clinical Research: Behind the Statistics, University of Cape Town via Coursera