Modeling Human Preference to Improve LLM Performance
Offered By: Snorkel AI via YouTube
Course Description
Overview
Explore a 20-minute conference talk by Hoang Tran, Machine Learning Engineer at Snorkel AI, on improving language model performance through human preference modeling. Learn about the development of reward models trained to mimic human annotator preferences and their application in accepting or rejecting base model responses. Discover how this approach significantly enhances model performance with minimal end-user guidance compared to traditional feedback methods. Gain insights into aligning language models with human preferences, direct preference optimization, and programmatic scaling of human preferences. Access accompanying slides, a summary of Snorkel AI's Enterprise LLM Summit, and related video recordings to deepen your understanding of this innovative approach in machine learning and generative AI.
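The reward-model idea described above (scoring a base model's responses and accepting or rejecting them according to learned human preferences) can be illustrated with a short sketch. The names below (Candidate, reward_fn, filter_responses, best_of_n) are illustrative assumptions, not code from the talk or from Snorkel's tooling; reward_fn stands in for a model trained to mimic human annotator preferences.

```python
# Minimal sketch: use a preference-trained reward model to accept/reject
# or rank responses from a base model. All names are hypothetical.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    prompt: str
    response: str


def filter_responses(
    candidates: List[Candidate],
    reward_fn: Callable[[str, str], float],  # higher score = closer to human preference
    threshold: float = 0.0,
) -> List[Candidate]:
    """Keep only responses whose reward score clears the acceptance threshold."""
    return [c for c in candidates if reward_fn(c.prompt, c.response) >= threshold]


def best_of_n(
    prompt: str,
    generate: Callable[[str], str],          # base model sampler
    reward_fn: Callable[[str, str], float],
    n: int = 4,
) -> Optional[str]:
    """Best-of-n selection: sample n responses, return the highest-scoring one."""
    responses = [generate(prompt) for _ in range(n)]
    if not responses:
        return None
    return max(responses, key=lambda r: reward_fn(prompt, r))
```

In practice the threshold or the value of n trades off quality against inference cost; the talk's specific acceptance criteria are not reproduced here.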
Syllabus
Introduction
Aligning language models with human preferences
Direct preference optimization
Programmatically scale human preferences
Results
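The "Direct preference optimization" topic above refers to the standard DPO formulation, which trains the policy directly on chosen/rejected response pairs relative to a frozen reference model rather than first fitting a separate reward model. Below is a minimal sketch of that loss; the tensor names and the beta value are illustrative, and this is not code from the talk.

```python
import torch
import torch.nn.functional as F


def dpo_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(y_chosen | x), shape [batch]
    policy_rejected_logps: torch.Tensor,  # log p_theta(y_rejected | x)
    ref_chosen_logps: torch.Tensor,       # same quantities under the frozen reference model
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,                    # strength of the implicit KL constraint
) -> torch.Tensor:
    """Standard DPO objective: increase the policy's preference for the chosen
    response over the rejected one, measured relative to the reference model."""
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_margin - rejected_margin)
    return -F.logsigmoid(logits).mean()
```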
Taught by
Snorkel AI
Related Courses
Solving the Last Mile Problem of Foundation Models with Data-Centric AI
MLOps.community via YouTube
Foundational Models in Enterprise AI - Challenges and Opportunities
MLOps.community via YouTube
Knowledge Distillation Demystified: Techniques and Applications
Snorkel AI via YouTube
Model Distillation - From Large Models to Efficient Enterprise Solutions
Snorkel AI via YouTube
Curate Training Data via Labeling Functions - 10 to 100x Faster
Snorkel AI via YouTube