Controlling Distribution Shifts in Language Models: A Data-Centric Approach
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore a lecture on controlling distribution shifts in language models through data-centric approaches. The talk is given by Tatsunori Hashimoto of Stanford University as part of the Emerging Generalization Settings series at the Simons Institute. Examine the challenges of cross-task and cross-domain generalization in NLP, focusing on the trade-off between generalization and control in language model pretraining. Discover two complementary strategies: algorithmic data filtering, which prioritizes benchmark-relevant training data, and domain adaptation through large-scale synthesis of domain-specific pretraining data. Gain insights into closing the gaps between pretraining and target evaluation that distribution shifts create.
Syllabus
Controlling distribution shifts in language models: a data-centric approach.
Taught by
Simons Institute
Related Courses
Introduction to Deep Learning (Massachusetts Institute of Technology via YouTube)
Taming Dataset Bias via Domain Adaptation (Alexander Amini via YouTube)
Making Our Models Robust to Changing Visual Environments (Andreas Geiger via YouTube)
Learning Compact Representation with Less Labeled Data from Sensors (tinyML via YouTube)
Geo-localization Framework for Real-world Scenarios - Defense Presentation (University of Central Florida via YouTube)