
Scaling LLM Test-Time Compute Optimally for Improved Performance

Offered By: Yannic Kilcher via YouTube

Tags

Scaling Laws Courses
Prompt Engineering Courses

Course Description

Overview

Explore a comprehensive analysis of scaling inference-time computation in Large Language Models (LLMs) through this in-depth video presentation. Delve into the research paper that investigates how LLMs can improve their performance by utilizing additional test-time computation. Examine two primary mechanisms for scaling test-time computation: searching against dense, process-based verifier reward models and adaptively updating the model's distribution over a response. Discover how the effectiveness of different approaches varies with prompt difficulty, leading to the development of a "compute-optimal" scaling strategy. Learn how this strategy can improve test-time compute efficiency by more than 4x compared to a best-of-N baseline. Gain insights into the implications of these findings for LLM pretraining and the trade-offs between inference-time and pretraining compute. Understand how, in certain scenarios, test-time compute can be leveraged to outperform significantly larger models in a FLOPs-matched evaluation.
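As a rough illustration of the best-of-N baseline the paper compares against, the sketch below shows how a verifier reward model can be used to pick the highest-scoring of N sampled responses. The functions `generate_candidates` and `verifier_score` are hypothetical placeholders standing in for an LLM sampler and a trained verifier; they are not APIs from the paper or from any specific library.

```python
# Minimal sketch of best-of-N sampling with a verifier reward model.
# `generate_candidates` and `verifier_score` are hypothetical stand-ins;
# in practice they would wrap an LLM sampler and a (process-based) reward
# model, neither of which is specified here.

import random
from typing import Callable, List


def generate_candidates(prompt: str, n: int) -> List[str]:
    # Placeholder for sampling N candidate responses from an LLM.
    return [f"candidate-{i} for: {prompt}" for i in range(n)]


def verifier_score(prompt: str, response: str) -> float:
    # Placeholder for a verifier / reward-model score of a response.
    return random.random()


def best_of_n(prompt: str, n: int,
              scorer: Callable[[str, str], float] = verifier_score) -> str:
    """Sample N candidates and return the one the verifier ranks highest."""
    candidates = generate_candidates(prompt, n)
    return max(candidates, key=lambda r: scorer(prompt, r))


if __name__ == "__main__":
    print(best_of_n("What is 12 * 13?", n=8))
```

The compute-optimal strategy discussed in the video goes further than this fixed-N baseline by choosing how to spend the test-time budget (e.g., search depth or number of revisions) per prompt, depending on its estimated difficulty.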

Syllabus

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)


Taught by

Yannic Kilcher

Related Courses

Introduction To Mechanical Micro Machining
Indian Institute of Technology, Kharagpur via Swayam
Biomaterials - Intro to Biomedical Engineering
Udemy
OpenAI Whisper - Robust Speech Recognition via Large-Scale Weak Supervision
Aleksa Gordić - The AI Epiphany via YouTube
Turbulence as Gibbs Statistics of Vortex Sheets - Alexander Migdal
Institute for Advanced Study via YouTube
City Analytics - Professor Peter Grindrod CBE
Alan Turing Institute via YouTube