A Theory for Emergence of Complex Skills in Language Models
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore a lecture on the emergence of complex skills in language models, presented by Sanjeev Arora of Princeton University. Delve into Large Language Models and Transformers, examining the poorly understood phenomenon of new skills emerging as parameter counts and training corpora are scaled up. Discover a novel approach that analyzes emergence using the empirical Scaling Laws of LLMs together with a simple statistical framework. Learn about the contributions of this research: a statistical framework relating cross-entropy loss to competence on basic skills underlying language tasks; a mathematical analysis revealing a strong form of inductive bias called "slingshot generalization"; and an example demonstrating how competence on tasks involving k-tuples of skills emerges at the same scaling point, and at the same rate, as competence on elementary skills. Gain insight into research that challenges conventional generalization theory and offers new perspectives on the capabilities of language models.
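The intuition behind the k-tuple claim can be illustrated with a toy simulation (a hypothetical sketch, not the paper's actual model): if each elementary skill fails independently with rate eps, a task requiring k skills succeeds with probability (1 - eps)^k ≈ 1 - k·eps. So once scaling drives eps low, competence on k-tuples tracks single-skill competence closely. The `skill_error` scaling rule below is an invented placeholder, not an empirical scaling law.

```python
import random

random.seed(0)

def skill_error(scale):
    # Hypothetical per-skill error rate that shrinks as model scale grows,
    # loosely mimicking a scaling law (an assumption for illustration only).
    return min(1.0, 1.0 / scale)

def tuple_success_rate(scale, k, trials=10_000):
    # A k-tuple task succeeds only if all k required skills succeed,
    # with each skill failing independently at rate eps.
    eps = skill_error(scale)
    wins = sum(
        all(random.random() > eps for _ in range(k))
        for _ in range(trials)
    )
    return wins / trials

# As scale grows, k-tuple competence rises toward (1 - eps)^k ≈ 1 - k*eps,
# emerging soon after single-skill competence does.
for scale in (2, 10, 100):
    print(scale, round(tuple_success_rate(scale, k=4), 3))
```

At small scale (eps = 0.5), 4-tuples almost always fail; at scale 100 (eps = 0.01), they succeed roughly 96% of the time, mirroring how composite skills appear to "emerge" once elementary error rates drop.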
Syllabus
A Theory for Emergence of Complex Skills in Language Models
Taught by
Simons Institute
Related Courses
Introduction To Mechanical Micro Machining - Indian Institute of Technology, Kharagpur via Swayam
Biomaterials - Intro to Biomedical Engineering - Udemy
OpenAI Whisper - Robust Speech Recognition via Large-Scale Weak Supervision - Aleksa Gordić - The AI Epiphany via YouTube
Turbulence as Gibbs Statistics of Vortex Sheets - Alexander Migdal - Institute for Advanced Study via YouTube
City Analytics - Professor Peter Grindrod CBE - Alan Turing Institute via YouTube