How to Build an LLM from Scratch - An Overview
Offered By: Shaw Talebi via YouTube
Course Description
Overview
Dive into a comprehensive 36-minute video tutorial on building Large Language Models (LLMs) from scratch. Explore key aspects of developing foundation LLMs based on models like GPT-3, Llama, and Falcon. Learn about the four crucial steps: data curation, model architecture, training at scale, and evaluation. Discover data sources, diversity, and preparation techniques. Understand transformer architectures, design choices, and model sizing. Gain insights into training stability, hyperparameter tuning, and various evaluation methods for both multiple-choice and open-ended tasks. Access numerous resources and references to deepen your understanding of LLM development.
Syllabus
Intro
How much does it cost?
4 Key Steps
Step 1: Data Curation
1.1: Data Sources
1.2: Data Diversity
1.3: Data Preparation
Step 2: Model Architecture (Transformers)
2.1: 3 Types of Transformers
2.2: Other Design Choices
2.3: How big do I make it?
Step 3: Training at Scale
3.1: Training Stability
3.2: Hyperparameters
Step 4: Evaluation
4.1: Multiple-choice Tasks
4.2: Open-ended Tasks
What's next?
Taught by
Shaw Talebi
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent