ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore an in-depth analysis of the ExT5 model, which pushes the limits of T5 by pre-training on ExMix, a collection of 107 diverse supervised NLP tasks. Learn about the model's architecture, task formulations, and performance compared to T5 baselines. Discover insights on multi-task scaling, co-training transfer among task families, and the impact of self-supervised data in pre-training. Gain an understanding of ExT5's improved performance on various NLP tasks and its enhanced sample efficiency during pre-training. This comprehensive video covers topics such as task selection, pre-training vs. pre-finetuning, and experimental results, providing valuable insights for researchers and practitioners in the field of natural language processing and transfer learning.
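To make the setup concrete, here is a minimal Python sketch (not code from the video or the ExT5 paper) of the two ingredients the description refers to: casting supervised tasks into T5-style text-to-text pairs and sampling a training batch from a weighted mixture of task streams. All function names, field names, and the sampling scheme are illustrative assumptions.

```python
import random

# Illustrative sketch only: cast supervised examples into text-to-text pairs
# and mix several task streams during pre-training, in the spirit of ExMix.

def cast_nli(example):
    # Natural language inference as text-to-text: the input carries a task
    # prefix, the target is the label rendered as a string.
    source = f"nli premise: {example['premise']} hypothesis: {example['hypothesis']}"
    target = example["label"]  # e.g. "entailment", "neutral", "contradiction"
    return {"input": source, "target": target}

def cast_summarization(example):
    # Summarization is already naturally text-to-text.
    source = f"summarize: {example['document']}"
    target = example["summary"]
    return {"input": source, "target": target}

def mixed_batch(task_streams, weights, batch_size=8):
    # Sample tasks proportionally to the given weights; each stream is an
    # iterator over already-cast {"input", "target"} examples.
    names = list(task_streams)
    batch = []
    for _ in range(batch_size):
        name = random.choices(names, weights=[weights[n] for n in names])[0]
        batch.append(next(task_streams[name]))
    return batch
```

In this framing, the mixing weights are the knob the video returns to when asking how much self-supervised data to include relative to the supervised tasks.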
Syllabus
- Intro & Overview
- Recap: The T5 model
- The ExT5 model and task formulations
- ExMix dataset
- Do different tasks help each other?
- Which tasks should we include?
- Pre-Training vs Pre-Finetuning
- A few hypotheses about what's going on
- How much self-supervised data to use?
- More experimental results
- Conclusion & Summary
Taught by
Yannic Kilcher
Related Courses
- The History and Relevance of the Rise of Generative AI (Vanderbilt University via Coursera)
- Visión artificial contemporánea (Universidad de los Andes via Coursera)
- Artificial Intelligence Foundations: Thinking Machines (LinkedIn Learning)
- Generative AI vs. Traditional AI (LinkedIn Learning)
- A Critical Analysis of Self-Supervision, or What We Can Learn From a Single Image (Yannic Kilcher via YouTube)