A Theory for Emergence of Complex Skills in Language Models
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore a lecture on the emergence of complex skills in language models, presented by Sanjeev Arora of Princeton University. Delve into Large Language Models and Transformers, examining the poorly understood phenomenon of new skills emerging as parameter counts and training corpora are scaled up. Discover a novel approach that analyzes emergence using the empirical Scaling Laws of LLMs together with a simple statistical framework. Learn about the contributions of this research: a statistical framework relating cross-entropy loss to competence on basic skills underlying language tasks; a mathematical analysis revealing a strong form of inductive bias called "slingshot generalization"; and an example demonstrating how competence on tasks involving k-tuples of skills emerges at the same scaling point, and at the same rate, as competence on elementary skills. Gain insight into research that challenges conventional generalization theory and offers new perspectives on the capabilities of language models.
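The intuition behind the k-tuple claim can be illustrated with a toy simulation (a hypothetical sketch, not the paper's actual model): if each elementary skill fails independently with rate eps, a task requiring k skills succeeds with probability (1 - eps)^k ≈ 1 - k·eps. So once scaling drives eps low, competence on k-tuples tracks single-skill competence closely. The `skill_error` scaling rule below is an invented placeholder, not an empirical scaling law.

```python
import random

random.seed(0)

def skill_error(scale):
    # Hypothetical per-skill error rate that shrinks as model scale grows,
    # loosely mimicking a scaling law (an assumption for illustration only).
    return min(1.0, 1.0 / scale)

def tuple_success_rate(scale, k, trials=10_000):
    # A k-tuple task succeeds only if all k required skills succeed,
    # with each skill failing independently at rate eps.
    eps = skill_error(scale)
    wins = sum(
        all(random.random() > eps for _ in range(k))
        for _ in range(trials)
    )
    return wins / trials

# As scale grows, k-tuple competence rises toward (1 - eps)^k ≈ 1 - k*eps,
# emerging soon after single-skill competence does.
for scale in (2, 10, 100):
    print(scale, round(tuple_success_rate(scale, k=4), 3))
```

At small scale (eps = 0.5), 4-tuples almost always fail; at scale 100 (eps = 0.01), they succeed roughly 96% of the time, mirroring how composite skills appear to "emerge" once elementary error rates drop.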
Syllabus
A Theory for Emergence of Complex Skills in Language Models
Taught by
Simons Institute
Related Courses
Introduction To Mechanical Micro Machining - Indian Institute of Technology, Kharagpur via Swayam
Biomaterials - Intro to Biomedical Engineering - Udemy
OpenAI Whisper - Robust Speech Recognition via Large-Scale Weak Supervision - Aleksa Gordić - The AI Epiphany via YouTube
Turbulence as Gibbs Statistics of Vortex Sheets - Alexander Migdal - Institute for Advanced Study via YouTube
City Analytics - Professor Peter Grindrod CBE - Alan Turing Institute via YouTube