Language Model Alignment: Theory and Algorithms
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore the intricacies of language model alignment in this comprehensive lecture by Ahmad Beirami from Google. Delve into the post-training process, which aims to generate samples from an aligned distribution that improves rewards such as safety and factuality while keeping divergence from the base model small. Examine the best-of-N baseline and more advanced methods that solve the KL-regularized reinforcement learning problem. Gain insights into key results through simplified examples and discover controlled decoding, a modular alignment approach that solves the KL-regularized RL problem while keeping the base model frozen by learning a prefix scorer, which offers inference-time configurability. Analyze the surprising effectiveness of best-of-N in achieving reward-KL tradeoffs that are competitive with or superior to state-of-the-art alignment baselines.
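To make the best-of-N baseline mentioned above concrete, here is a minimal sketch of the procedure; the interfaces (sample_fn, reward_fn) and the choice of N are illustrative placeholders, not code from the lecture. Best-of-N draws N completions from the frozen base model and keeps the one with the highest reward, and its KL divergence from the base model is commonly upper-bounded by the analytical expression log N - (N-1)/N, which is one reason its reward-KL tradeoff can be analyzed cleanly.

```python
import math

def best_of_n(prompt, sample_fn, reward_fn, n=16):
    """Best-of-N sampling sketch (assumed interfaces, not the lecture's code).

    sample_fn(prompt) -> str          : draws one completion from the frozen base model
    reward_fn(prompt, text) -> float  : scores a completion (e.g., safety, factuality)
    Returns the highest-reward completion among n independent draws.
    """
    candidates = [sample_fn(prompt) for _ in range(n)]
    return max(candidates, key=lambda y: reward_fn(prompt, y))

def best_of_n_kl_upper_bound(n):
    # Commonly cited analytical upper bound on KL(best-of-n policy || base model):
    # log(n) - (n - 1) / n, independent of the reward function.
    return math.log(n) - (n - 1) / n
```

Controlled decoding, by contrast, keeps the base model frozen and instead trains a prefix scorer that steers generation at inference time toward the same KL-regularized objective.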
Syllabus
Language Model Alignment: Theory & Algorithms
Taught by
Simons Institute
Related Courses
Introduction to Artificial Intelligence - Stanford University via Udacity
Probabilistic Graphical Models 1: Representation - Stanford University via Coursera
Artificial Intelligence for Robotics - Stanford University via Udacity
Computer Vision: The Fundamentals - University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course) - California Institute of Technology via Independent