Superposition in LLM Feature Representations
Offered By: Conf42 via YouTube
Course Description
Overview
Explore the concept of superposition in large language model feature representations in this 47-minute conference talk from Conf42 LLMs 2024. Delve into mechanistic interpretability, neural network representations, and the qualities of those representations. Examine decomposability and linearity in depth, including linear composition as a compression scheme and the demands linearity places on a network. Investigate the linear representation puzzle and neuron-feature requirements before diving into the superposition hypothesis. Analyze the role of sparsity and learn techniques for recovering features stored in superposition. Conclude with a discussion of feature exploration in large language models.
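The superposition idea described above lends itself to a small numerical demonstration. The sketch below is not from the talk; all names and parameters are illustrative. It stores more features than a layer has dimensions by assigning each feature a random, nearly orthogonal direction, then shows that a sparse set of active features can usually be read back out with only small interference:

```python
# Toy illustration of the superposition hypothesis (illustrative only):
# more features than dimensions, each feature a nearly orthogonal direction,
# recoverable as long as few features are active at once (sparsity).
import numpy as np

rng = np.random.default_rng(0)

n_features, n_dims = 50, 20          # more features than dimensions
# Random unit vectors in a low-dimensional space are nearly orthogonal.
directions = rng.normal(size=(n_features, n_dims))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# A sparse feature vector: only 3 of the 50 features are active.
features = np.zeros(n_features)
active = rng.choice(n_features, size=3, replace=False)
features[active] = 1.0

# Linear composition: the activation is a weighted sum of feature directions.
activation = features @ directions   # shape (n_dims,)

# Naive readout: project the activation back onto every feature direction.
# Active features typically stand out; inactive ones show only small
# interference noise from the non-zero overlaps between directions.
readout = directions @ activation    # shape (n_features,)

print("active features:", sorted(active.tolist()))
print("top readout:    ", np.argsort(readout)[-3:][::-1].tolist())
```

With denser feature vectors the interference terms accumulate and the readout degrades, which is the sparsity condition the talk highlights as a prerequisite for recovering features in superposition.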
Syllabus
intro
preamble
mechanistic interpretability
neural network representations
qualities of representations
decomposability
linearity
linear composition as a compression scheme
demands of linearity
the linear representation puzzle
neuron-feature requirements
experience with llms
the superposition hypothesis
sparsity
recovering features in superposition
feature exploration
thanks
Taught by
Conf42
Related Courses
Neural Networks for Machine Learning (University of Toronto via Coursera)
Good Brain, Bad Brain: Basics (University of Birmingham via FutureLearn)
Statistical Learning with R (Stanford University via edX)
Machine Learning 1—Supervised Learning (Brown University via Udacity)
Fundamentals of Neuroscience, Part 2: Neurons and Networks (Harvard University via edX)