Superposition in LLM Feature Representations
Offered By: Conf42 via YouTube
Course Description
Overview
Explore the concept of superposition in large language model feature representations in this 47-minute conference talk from Conf42 LLMs 2024. Delve into mechanistic interpretability, neural network representations, and the qualities of these representations. Examine decomposability and linearity in depth, including linear composition as a compression scheme and its demands. Investigate the linear representation puzzle and neuron-feature requirements before diving into the superposition hypothesis. Analyze sparsity and learn techniques for recovering features in superposition. Conclude with a discussion on feature exploration in large language models.
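The core idea the talk covers, that a model can store more sparse features than it has dimensions and still read them back out, can be illustrated with a minimal NumPy sketch. This is an illustrative toy, not material from the talk itself; the feature directions, dimensions, and readout method here are all hypothetical assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, d_model = 8, 4  # more features than dimensions: superposition
# hypothetical feature directions: random unit vectors in model space
W = rng.normal(size=(n_features, d_model))
W /= np.linalg.norm(W, axis=1, keepdims=True)

# a sparse activation: only feature 2 of the 8 is active
acts = np.zeros(n_features)
acts[2] = 1.0

# superposed representation lives in only d_model dimensions
x = acts @ W

# naive readout: project back onto every feature direction;
# interference from the other directions is small because they
# are nearly orthogonal, so the active feature still wins
readout = x @ W.T
recovered = int(np.argmax(readout))
print(recovered)  # identifies feature 2 as the active one
```

Because the active feature's direction has unit norm, its readout is exactly 1, while every other direction's readout is a dot product of two distinct random unit vectors and so is strictly smaller; with denser activations the interference grows, which is why sparsity is central to the superposition hypothesis.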
Syllabus
intro
preamble
mechanistic interpretability
neural network representations
qualities of representations
decomposability
linearity
linear composition as a compression scheme
demands of linearity
the linear representation puzzle
neuron-feature requirements
experience with llms
the superposition hypothesis
sparsity
recovering features in superposition
demands of linearity
feature exploration
thanks
Taught by
Conf42
Related Courses
The Quantum Internet and Quantum Computers: How Will They Change the World? (Delft University of Technology via edX)
Vibrations and Waves (Massachusetts Institute of Technology via edX)
Fundamentos de Oscilaciones y Ondas para Ingeniería (Universitat Politècnica de València via edX)
Introduction to Quantum Computing for Everyone (The University of Chicago via edX)
Introduction to Quantum Computing (LinkedIn Learning)