Language Model Alignment: Theory and Algorithms
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore the intricacies of language model alignment in this comprehensive lecture by Ahmad Beirami from Google. Delve into the post-training process aimed at generating samples from an aligned distribution that increases rewards such as safety and factuality while minimizing divergence from the base model. Examine the best-of-N baseline and more advanced methods that solve KL-regularized reinforcement learning problems. Gain insight into key results through simplified examples and discover a novel modular alignment approach called controlled decoding, which solves the KL-regularized RL problem while keeping the base model frozen by learning a prefix scorer, offering inference-time configurability. Analyze the surprising effectiveness of best-of-N in achieving reward-KL tradeoffs competitive with or superior to state-of-the-art alignment baselines.
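The best-of-N baseline mentioned above can be summarized in a few lines: draw N samples from the frozen base model, score each with a reward model, and return the highest-scoring sample. The sketch below illustrates the idea with toy stand-in functions; `toy_generate` and `toy_reward` are placeholders for a real language model and reward model, not part of the lecture's code.

```python
import random

def best_of_n(generate, reward, prompt, n=4):
    """Sample n candidate responses and return the highest-reward one.

    The base model stays frozen: alignment pressure comes purely from
    selection at inference time, which is why the reward-KL tradeoff
    of best-of-N is easy to tune by changing n.
    """
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=reward)

# Toy stand-ins (assumptions, for illustration only).
random.seed(0)

def toy_generate(prompt):
    return prompt + " " + random.choice(
        ["safe answer", "risky answer", "neutral answer"]
    )

def toy_reward(response):
    # e.g. a safety reward model score
    return 1.0 if "safe" in response else 0.0

print(best_of_n(toy_generate, toy_reward, "Q:", n=8))
```

Increasing `n` raises the expected reward of the selected sample but also moves the induced output distribution further from the base model, which is the tradeoff the lecture analyzes.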
Syllabus
Language Model Alignment: Theory & Algorithms
Taught by
Simons Institute
Related Courses
Computational Neuroscience — University of Washington via Coursera
Reinforcement Learning — Brown University via Udacity
Reinforcement Learning — Indian Institute of Technology Madras via Swayam
FA17: Machine Learning — Georgia Institute of Technology via edX
Introduction to Reinforcement Learning — Higher School of Economics via Coursera