YoVDO

Superposition in LLM Feature Representations

Offered By: Conf42 via YouTube

Tags

Neural Networks Courses
Superposition Courses

Course Description

Overview

Explore the concept of superposition in large language model feature representations in this 47-minute conference talk from Conf42 LLMs 2024. Delve into mechanistic interpretability, neural network representations, and the qualities of these representations. Examine decomposability and linearity in depth, including linear composition as a compression scheme and its demands. Investigate the linear representation puzzle and neuron-feature requirements before diving into the superposition hypothesis. Analyze sparsity and learn techniques for recovering features in superposition. Conclude with a discussion on feature exploration in large language models.

Syllabus

intro
preamble
mechanistic interpretability
neural network representations
qualities of representations
decomposability
linearity
linear composition as a compression scheme
demands of linearity
the linear representation puzzle
neuron-feature requirements
experience with llms
the superposition hypothesis
sparsity
recovering features in superposition
demands of linearity
feature exploration
thanks


Taught by

Conf42

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
Good Brain, Bad Brain: Basics
University of Birmingham via FutureLearn
Statistical Learning with R
Stanford University via edX
Machine Learning 1—Supervised Learning
Brown University via Udacity
Fundamentals of Neuroscience, Part 2: Neurons and Networks
Harvard University via edX