
KDD 2020: Learning by Exploration - Part 2

Offered By: Association for Computing Machinery (ACM) via YouTube

Tags

Reinforcement Learning Courses
Recommendation Systems Courses

Course Description

Overview

Explore the second part of a conference talk from KDD 2020 focusing on learning by exploration. Delve into advanced concepts such as graph Laplacian regularization, social influence modeling, adaptive user clustering, and Particle Thompson Sampling. Examine techniques for addressing cold start challenges, encoding user dependencies, and leveraging historical data for model warm-starting. Investigate context-dependent clustering methods, probabilistic matrix factorization, and online Bayesian parameter estimation. Consider the problem-related regret lower bounds and evaluate the effectiveness of current algorithms in utilizing problem structure information.
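The talk builds on contextual bandits of the LinUCB family. As background, here is a minimal sketch of the core LinUCB step (ridge-regression estimate plus an upper-confidence bonus); the function names and the `alpha` exploration parameter are illustrative, not from the talk:

```python
import numpy as np

def linucb_choose(A, b, contexts, alpha=1.0):
    """One LinUCB step: ridge-regression estimate theta = A^{-1} b,
    then score each arm by predicted reward + alpha * confidence width."""
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b
    scores = [x @ theta + alpha * np.sqrt(x @ A_inv @ x) for x in contexts]
    return int(np.argmax(scores))

def linucb_update(A, b, x, reward):
    """Rank-one update of the sufficient statistics after observing a reward."""
    A += np.outer(x, x)
    b += reward * x
```

Starting from `A = np.eye(d)` and `b = np.zeros(d)`, the estimate begins at zero and the confidence width drives exploration; the graph-Laplacian variants discussed in the talk add a regularizer that ties the per-user `theta` vectors of connected users together.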

Syllabus

Intro
Build an independent LinUCB for each user? • Cold start challenge • Users are not independent
Connected users are assumed to share similar model parameters • Graph Laplacian based regularization upon ridge regression to model dependency
Graph Laplacian based regularization upon ridge regression to model dependency • Encode the graph Laplacian in the context; formulate as a higher-dimensional LinUCB
Social influence among users: content and opinion sharing in a social network W • Reward: weighted average of expected rewards among friends
Adaptively cluster users into groups by iteratively removing edges
Item clustering • Each item cluster is associated with its own user clustering
Context-dependent clustering • For the current user i, find the neighboring user set for every candidate item x • Then aggregate the historical rewards/predictions within the user cluster
Particle Thompson Sampling (PTS) [KBKTC15] • Probabilistic Matrix Factorization framework • Particle filtering for online Bayesian parameter estimation • Thompson Sampling for exploration
Alternating Least Squares for optimization • Exploration considers uncertainty from two factors
Leverage historical data to warm-start the model, reducing the need for exploration
What is the problem-related (structure-related) regret lower bound? E.g., user dependency structure, low rank, offline data • Do current algorithms fully utilize the information in the problem structure?
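The Thompson Sampling exploration principle behind PTS can be seen in its simplest form on a Bernoulli bandit: sample a plausible mean from each arm's posterior, pull the argmax, and update. This is a hedged sketch of plain Thompson Sampling only, not the particle-filtering matrix-factorization version presented in the talk:

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Bernoulli Thompson Sampling with Beta(1, 1) priors: draw a sampled
    mean per arm from its Beta posterior, pull the best sample, update."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    alpha = [1] * n_arms  # posterior successes + 1
    beta = [1] * n_arms   # posterior failures + 1
    total_reward = 0
    for _ in range(horizon):
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total_reward += reward
    return total_reward
```

PTS replaces the closed-form Beta posterior with a particle approximation over latent user/item factors, but the select-then-update loop is the same.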


Taught by

Association for Computing Machinery (ACM)

Related Courses

Computational Neuroscience
University of Washington via Coursera
Reinforcement Learning
Brown University via Udacity
Reinforcement Learning
Indian Institute of Technology Madras via Swayam
FA17: Machine Learning
Georgia Institute of Technology via edX
Introduction to Reinforcement Learning
Higher School of Economics via Coursera