Language Model Alignment: Theory and Algorithms
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore the intricacies of language model alignment in this comprehensive lecture by Ahmad Beirami from Google. Delve into the post-training process that aims to generate samples from an aligned distribution, improving a reward such as safety or factuality while minimizing divergence from the base model. Examine the best-of-N baseline and more advanced methods that solve KL-regularized reinforcement learning (RL) problems. Gain insight into key results through simplified examples and discover a novel modular alignment approach called controlled decoding, which solves the KL-regularized RL problem by learning a prefix scorer while keeping the base model frozen, making alignment configurable at inference time. Finally, analyze the surprising effectiveness of best-of-N, which achieves reward-KL tradeoffs competitive with or superior to state-of-the-art alignment baselines.
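For readers unfamiliar with the setup, the KL-regularized RL problem referenced above is typically written as follows. This is a standard formulation with assumed notation, not an excerpt from the lecture: the aligned policy maximizes expected reward minus a KL penalty measured against the frozen base model.

```latex
% KL-regularized RL objective (standard formulation; notation assumed, not from the talk):
%   \pi        : aligned policy being learned
%   \pi_{ref}  : frozen base (reference) model
%   r(x, y)    : reward, e.g. a safety or factuality score
%   \beta      : strength of the KL regularizer
\[
  \max_{\pi} \;
    \mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi(\cdot \mid x)}\big[\, r(x, y) \,\big]
    \;-\; \beta \, \mathrm{KL}\big( \pi(\cdot \mid x) \,\big\|\, \pi_{\mathrm{ref}}(\cdot \mid x) \big)
\]
% The maximizer tilts the base model by the exponentiated reward:
\[
  \pi^{*}(y \mid x) \;\propto\; \pi_{\mathrm{ref}}(y \mid x)\, \exp\!\big( r(x, y) / \beta \big)
\]
```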
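The best-of-N baseline is simple enough to sketch directly: draw N samples from the frozen base model and keep the one the reward model scores highest. The snippet below is a minimal illustration assuming hypothetical `base_model.generate` and `reward_model.score` interfaces; it is not code from the lecture.

```python
def best_of_n(prompt, base_model, reward_model, n=16):
    """Best-of-N sampling: draw n candidate responses from the frozen
    base model and return the one the reward model scores highest.

    `base_model.generate` and `reward_model.score` are hypothetical
    interfaces used only for illustration.
    """
    candidates = [base_model.generate(prompt) for _ in range(n)]
    scores = [reward_model.score(prompt, y) for y in candidates]
    best_index = max(range(n), key=lambda i: scores[i])
    return candidates[best_index]
```

Its appeal is that the base model never changes and the reward-KL tradeoff is tuned simply by varying N, at the cost of N generations per query.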
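Controlled decoding, by contrast, keeps the base model frozen and learns a prefix scorer that estimates the expected reward of completions of a partial response; at inference time, the next-token distribution is tilted by the exponentiated scorer. Below is a rough single-step sketch under those assumptions, with hypothetical `base_model.next_token_logprobs` and `prefix_scorer` interfaces, not the lecture's actual implementation.

```python
import math
import random

def controlled_decode_step(prompt, partial_response, base_model, prefix_scorer, beta=1.0):
    """One token of controlled decoding: tilt the frozen base model's
    next-token distribution by exp(prefix_scorer / beta) and sample.

    `base_model.next_token_logprobs` returns a dict mapping candidate
    tokens to log-probabilities; `prefix_scorer` estimates the expected
    reward of completions starting with the given prefix. Both are
    hypothetical interfaces used only for illustration.
    """
    logprobs = base_model.next_token_logprobs(prompt, partial_response)
    tilted = {
        token: lp + prefix_scorer(prompt, partial_response + [token]) / beta
        for token, lp in logprobs.items()
    }
    # Normalize the tilted scores (numerically stable softmax) and sample.
    max_score = max(tilted.values())
    weights = {t: math.exp(v - max_score) for t, v in tilted.items()}
    total = sum(weights.values())
    tokens, probs = zip(*((t, w / total) for t, w in weights.items()))
    return random.choices(tokens, weights=probs, k=1)[0]
```

Because the prefix scorer is applied only at decoding time, the same frozen base model can be steered toward different rewards, or different tradeoff strengths, without retraining.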
Syllabus
Language Model Alignment: Theory & Algorithms
Taught by
Simons Institute
Related Courses
Launching into Machine Learning 日本語版 (Google Cloud via Coursera)
Launching into Machine Learning auf Deutsch (Google Cloud via Coursera)
Launching into Machine Learning en Français (Google Cloud via Coursera)
Launching into Machine Learning en Español (Google Cloud via Coursera)
Основы машинного обучения (Higher School of Economics via Coursera)