Mixtral of Experts - Paper Explained
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore an in-depth analysis of the Mixtral of Experts paper in this comprehensive video lecture. Delve into the intricacies of Sparse Mixture of Experts (SMoE) language models, comparing Mixtral 8x7B's architecture to Mistral 7B and examining its performance against Llama 2 70B and GPT-3.5. Learn about expert routing, sparse expert routing, and expert parallelism. Discover the experimental results, routing analysis, and conclusions drawn from this groundbreaking research in natural language processing and artificial intelligence.
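The overview mentions expert routing and sparse expert routing as covered topics. As a rough illustration of that idea only (not code from the video or from the Mixtral paper), here is a minimal NumPy sketch of a sparse mixture-of-experts feed-forward layer that routes each token to its top-2 of 8 experts; the layer sizes are illustrative, and a ReLU expert stands in for Mixtral's SwiGLU experts.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class SparseMoELayer:
    """Toy sparse Mixture-of-Experts feed-forward layer (illustrative only).

    Each token is routed to the top_k highest-scoring experts (Mixtral uses
    8 experts with top_k=2); the selected experts' outputs are combined with
    softmax-normalized gate weights. All hyperparameters here are assumptions
    chosen to keep the example small.
    """

    def __init__(self, d_model=16, d_ff=32, n_experts=8, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.top_k = top_k
        # Router: a single linear map from the hidden state to expert logits.
        self.w_gate = rng.normal(size=(d_model, n_experts)) * 0.02
        # Each expert is a small two-layer feed-forward network.
        self.experts = [
            (rng.normal(size=(d_model, d_ff)) * 0.02,
             rng.normal(size=(d_ff, d_model)) * 0.02)
            for _ in range(n_experts)
        ]

    def __call__(self, x):
        # x: (n_tokens, d_model)
        logits = x @ self.w_gate                              # (n_tokens, n_experts)
        top = np.argsort(logits, axis=-1)[:, -self.top_k:]    # top-k expert indices per token
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            # Renormalize only the selected experts' logits.
            gates = softmax(logits[t, top[t]])
            for gate, e in zip(gates, top[t]):
                w1, w2 = self.experts[e]
                h = np.maximum(x[t] @ w1, 0.0)                # ReLU stand-in for SwiGLU
                out[t] += gate * (h @ w2)
        return out

# Usage: route 4 random tokens through the toy layer.
layer = SparseMoELayer()
tokens = np.random.default_rng(1).normal(size=(4, 16))
print(layer(tokens).shape)  # (4, 16)
```

Because only 2 of the 8 experts run per token, the layer's active parameter count per token stays far below its total parameter count, which is the trade-off the video's sparse-routing and expert-parallelism sections discuss.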
Syllabus
- Introduction
- Mixture of Experts
- Classic Transformer Blocks
- Expert Routing
- Sparse Expert Routing
- Expert Parallelism
- Experimental Results
- Routing Analysis
- Conclusion
Taught by
Yannic Kilcher
Related Courses
- Artificial Intelligence Foundations: Neural Networks (LinkedIn Learning)
- Transformers: Text Classification for NLP Using BERT (LinkedIn Learning)
- TensorFlow: Working with NLP (LinkedIn Learning)
- Learn Natural Language Processing with BERT! - NLP Techniques from Attention and Transformer to BERT (Udemy)
- Complete Natural Language Processing Tutorial in Python (Keith Galli via YouTube)