Controlling Distribution Shifts in Language Models: A Data-Centric Approach
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore a lecture on controlling distribution shifts in language models through data-centric approaches, presented by Tatsunori Hashimoto of Stanford University as part of the Emerging Generalization Settings series at the Simons Institute. Examine the challenges of cross-task and cross-domain generalization in NLP, focusing on the trade-offs between generalization and control in language model pretraining. Discover two complementary strategies: algorithmic data filtering, which prioritizes benchmark-relevant training data, and domain adaptation through large-scale synthesis of domain-specific pretraining data. Gain insights into closing the gaps between pretraining and target evaluation that distribution shifts cause in language models.
Syllabus
Controlling distribution shifts in language models: a data-centric approach.
Taught by
Simons Institute
Related Courses
Microsoft Bot Framework and Conversation as a Platform
Microsoft via edX
Unlocking the Power of OpenAI for Startups - Microsoft for Startups
Microsoft via YouTube
Improving Customer Experiences with Speech to Text and Text to Speech
Microsoft via YouTube
Stanford Seminar - Deep Learning in Speech Recognition
Stanford University via YouTube
Select Topics in Python: Natural Language Processing
Codio via Coursera