
Understanding Mixture of Experts in Large Language Models

Offered By: Trelis Research via YouTube

Tags

Machine Learning, Neural Networks, GPT-3, GPT-4, Binary Trees, Loss Functions, Model Training, Mixture-of-Experts

Course Description

Overview

Explore the concept of Mixture of Experts (MoE) in this 28-minute video lecture. Delve into the rationale behind MoE, its training process, and common pitfalls. Learn about techniques such as adding noise during training and adjusting the loss function to keep the router's expert usage even. Examine whether MoE is useful for running large language models on laptops and how it might benefit major AI companies. Investigate the binary tree MoE (fast feedforward, FFF) approach and compare performance data across GPT, MoE, and FFF models. Analyze the inference speed-up achievable with binary tree MoE and evaluate the overall viability of MoE in various contexts. Gain insights into why large companies might adopt MoE for their AI systems.
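To make the routing techniques mentioned above concrete, here is a minimal sketch (not code from the video) of a token-level MoE layer with noisy top-k gating and a load-balancing auxiliary loss for router evenness, written in PyTorch. All class names, dimensions, and hyperparameters are illustrative assumptions.

```python
# Minimal MoE layer sketch: noisy top-k routing plus a load-balancing
# auxiliary loss, in the spirit of Shazeer et al. (2017) / Switch Transformer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=4, top_k=2, noise_std=1.0):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)  # per-expert routing logits
        self.top_k, self.noise_std = top_k, noise_std

    def forward(self, x):                            # x: (tokens, d_model)
        logits = self.router(x)
        if self.training and self.noise_std > 0:
            # Adding noise during training encourages exploration of experts.
            logits = logits + torch.randn_like(logits) * self.noise_std
        probs = logits.softmax(dim=-1)               # (tokens, n_experts)
        top_p, top_i = probs.topk(self.top_k, dim=-1)  # keep the k best experts per token
        # (Standard implementations renormalize top_p to sum to 1; omitted here.)

        # Load-balancing ("router evenness") auxiliary loss: compare the fraction
        # of tokens whose top-1 choice is each expert with the mean router
        # probability assigned to that expert, and penalize uneven usage.
        n_experts = probs.shape[-1]
        token_frac = F.one_hot(top_i[:, 0], n_experts).float().mean(dim=0)
        prob_frac = probs.mean(dim=0)
        aux_loss = n_experts * (token_frac * prob_frac).sum()

        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_i[:, k] == e
                if mask.any():
                    out[mask] += top_p[mask, k].unsqueeze(-1) * expert(x[mask])
        return out, aux_loss

layer = MoELayer()
y, aux = layer(torch.randn(8, 64))   # y: (8, 64), aux: scalar auxiliary loss
```

In practice the auxiliary loss is added to the language-modeling loss with a small coefficient (for example 0.01) so the router spreads tokens across experts without dominating training.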

Syllabus

GPT-3, GPT-4 and Mixture of Experts
Why Mixture of Experts?
The idea behind Mixture of Experts
How to train MoE
Problems training MoE
Adding noise during training
Adjusting the loss function for router evenness
Is MoE useful for LLMs on laptops?
How might MoE help big companies like OpenAI?
Disadvantages of MoE
Binary tree MoE (fast feedforward); see the sketch after this syllabus
Data on GPT vs MoE vs FFF
Inference speed-up with binary tree MoE
Recap - Does MoE make sense?
Why might big companies use MoE?
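For the binary tree MoE (fast feedforward, FFF) item above, the following is a minimal illustrative sketch, assuming heap-style node indexing and hard routing as used at inference; it is not the course's or the FFF paper's reference implementation, and all names and sizes are assumptions.

```python
# Binary-tree routing sketch: each internal node makes a left/right decision,
# so only one leaf expert on the chosen root-to-leaf path is evaluated.
# Inference cost grows with tree depth (log of the number of leaves),
# not with the total number of experts.
import torch
import torch.nn as nn

class FastFeedForward(nn.Module):
    def __init__(self, d_model=64, d_leaf=128, depth=3):
        super().__init__()
        self.depth = depth
        self.nodes = nn.ModuleList(nn.Linear(d_model, 1) for _ in range(2 ** depth - 1))
        self.leaves = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_leaf), nn.GELU(), nn.Linear(d_leaf, d_model))
            for _ in range(2 ** depth)
        )

    def forward(self, x):                    # x: (tokens, d_model); hard routing as at inference
        out = torch.empty_like(x)
        for t in range(x.shape[0]):          # per-token routing, written for clarity not speed
            node, tok = 0, x[t]
            for _ in range(self.depth):      # walk the binary tree of decision nodes
                go_right = torch.sigmoid(self.nodes[node](tok)) > 0.5
                node = 2 * node + (2 if go_right else 1)   # heap-style child indexing
            leaf = node - (2 ** self.depth - 1)            # map node id to leaf index
            out[t] = self.leaves[leaf](tok)
        return out

ffn = FastFeedForward()
print(ffn(torch.randn(4, 64)).shape)   # torch.Size([4, 64])
```

During training, FFF-style layers soften the left/right decisions so gradients can flow through the whole tree; the hard path above reflects the inference-time behavior that gives the speed-up discussed in the video.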


Taught by

Trelis Research

Related Courses

TensorFlow Developer Certificate Exam Prep
A Cloud Guru
Post Graduate Certificate in Advanced Machine Learning & AI
Indian Institute of Technology Roorkee via Coursera
Advanced AI Techniques for the Supply Chain
LearnQuest via Coursera
Advanced Learning Algorithms
DeepLearning.AI via Coursera
IBM AI Engineering
IBM via Coursera