YoVDO

Synthetic Data - Friend or Foe in the Age of Scaling?

Offered By: Institut des Hautes Etudes Scientifiques (IHES) via YouTube

Tags

Synthetic Data Courses
Artificial Intelligence Courses
Machine Learning Courses
Neural Networks Courses
Transformers Courses
Scaling Laws Courses

Course Description

Overview

This 56-minute lecture by Julia Kempe, hosted by the Institut des Hautes Etudes Scientifiques (IHES), examines how synthetic data affects the scaling of AI models and large language models. It develops a theoretical framework for model collapse through the lens of scaling laws, asking how the growing share of synthesized data in training corpora affects model improvement and performance. The lecture covers several decay phenomena: loss of scaling, shifted scaling across generations, "un-learning" of skills, and grokking when human and synthetic data are combined. It then shows how the theory is validated in large-scale experiments, using a transformer on arithmetic tasks and the LLM Llama 2 on text generation, and considers the challenges and implications for AI development as synthetic data becomes more prevalent in training datasets.

Syllabus

Julia Kempe - Synthetic Data – Friend or Foe in the Age of Scaling?


Taught by

Institut des Hautes Etudes Scientifiques (IHES)
