OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
Offered By: Unify via YouTube
Course Description
Overview
Explore a comprehensive presentation on OpenMoE, an early effort in open mixture-of-experts language models, delivered by Fuzhao Xue. Dive into the intricacies of this innovative approach to large language models, including the development of a series of open-source, decoder-only MoE LLMs ranging from 650M to 34B parameters. Learn about the cost-effectiveness of MoE models compared to dense LLMs, and gain insights into the routing mechanisms within these models. Discover key concepts such as Context-Independent Specialization and the challenges in routing decisions. Access additional resources, including the original research paper and related content from Unify, to deepen your understanding of this cutting-edge AI technology.
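The overview mentions the routing mechanisms inside MoE models. As a minimal illustrative sketch (not OpenMoE's actual implementation; the function names and example logits here are invented for illustration), a learned router scores each expert per token, keeps the top-k experts, and renormalizes their gate weights:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def top_k_route(gate_logits, k=2):
    """Pick the top-k experts for one token and renormalize their gate weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1;
    the token's output is the weighted sum of those experts' outputs.
    """
    probs = softmax(gate_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in ranked)
    return [(i, probs[i] / total) for i in ranked]

# Hypothetical example: router logits for one token over 4 experts.
logits = [0.1, 2.0, -1.0, 1.5]
routing = top_k_route(logits, k=2)  # experts 1 and 3 are selected
```

Because only k experts run per token, compute cost stays close to a small dense model even as total parameter count grows, which is the cost-effectiveness argument the talk examines.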
Syllabus
OpenMoE Explained
Taught by
Unify
Related Courses
GShard - Scaling Giant Models with Conditional Computation and Automatic Sharding (Yannic Kilcher via YouTube)
Learning Mixtures of Linear Regressions in Subexponential Time via Fourier Moments (Association for Computing Machinery (ACM) via YouTube)
Modules and Architectures (Alfredo Canziani via YouTube)
Stanford Seminar - Mixture of Experts Paradigm and the Switch Transformer (Stanford University via YouTube)
Decoding Mistral AI's Large Language Models - Building Blocks and Training Strategies (Databricks via YouTube)