Do Androids Know They're Only Dreaming of Electric Sheep?
Offered By: USC Information Sciences Institute via YouTube
Course Description
Overview
Explore hallucination detection in transformer language models through this 59-minute lecture, presented at the USC Information Sciences Institute by Sky Wang of Columbia University. Delve into the design of probes trained on a model's internal representations to predict hallucinatory behavior in in-context generation tasks. Examine the creation of a span-annotated dataset of organic and synthetic hallucinations across several tasks. Discover why probes trained on synthetic hallucinations face ecological validity challenges when used to detect organic hallucinations. Analyze how the hallucination information carried by hidden states varies across tasks and distributions. Investigate how the saliency of intrinsic and extrinsic hallucinations differs across layers, hidden state types, and tasks. Learn about the potential of probing as an efficient alternative to language model hallucination evaluation when model states are available. Sky Wang is a Ph.D. candidate in Computer Science at Columbia University whose research focuses on Natural Language Processing, Computational Social Science, and mechanistic interpretability.
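For readers new to probing, the general idea of a probe on internal representations can be sketched as a small classifier fitted to hidden-state vectors. The sketch below is only an illustration under stated assumptions, not the method presented in the lecture: the hidden states are synthetic random vectors, the labels are made up, and the names train_probe, predict, accuracy, and make_toy_states are hypothetical. A plain logistic-regression probe stands in for whatever probe architecture the talk actually uses.

```python
import math
import random


def train_probe(states, labels, lr=0.5, epochs=200):
    """Fit a logistic-regression probe with batch gradient descent."""
    dim = len(states[0])
    n = len(states)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(states, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return 1 (flagged as hallucinated) or 0 (not flagged) for one vector."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0


def accuracy(w, b, states, labels):
    """Fraction of vectors whose predicted label matches the given label."""
    hits = sum(predict(w, b, x) == y for x, y in zip(states, labels))
    return hits / len(labels)


def make_toy_states(n, dim, rng):
    """Synthetic stand-in for hidden states (assumption, not real model data).

    The label is encoded as a shift along the first dimension only.
    """
    states, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x[0] += 3.0 if y == 1 else -3.0
        states.append(x)
        labels.append(y)
    return states, labels


if __name__ == "__main__":
    rng = random.Random(0)
    train_x, train_y = make_toy_states(200, 8, rng)
    test_x, test_y = make_toy_states(100, 8, rng)
    w, b = train_probe(train_x, train_y)
    print("held-out accuracy:", accuracy(w, b, test_x, test_y))
```

In a real setting the toy vectors would be replaced by hidden states extracted from a language model at a chosen layer, and the labels by span-level hallucination annotations. Evaluating a probe on data drawn from a different distribution than its training data (for example, training on synthetic and testing on organic hallucinations) is the kind of check the overview's ecological validity point refers to.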
Syllabus
Do Androids Know They’re Only Dreaming of Electric Sheep?
Taught by
USC Information Sciences Institute
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Artificial Intelligence for Robotics
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent