
Imbue - Training a 70B Model from Scratch - Infrastructure and Challenges

Offered By: Aleksa Gordić - The AI Epiphany via YouTube

Tags

Machine Learning Courses, Artificial Intelligence Courses, High Performance Computing Courses, Distributed Computing Courses, Data Centers Courses, Model Training Courses

Course Description

Overview

A 59-minute video in which Bowei from Imbue discusses the company's project of training a 70B model from scratch, with a focus on the infrastructure built to support it, as described in Imbue's detailed blog post. The conversation covers Bowei's background, Imbue's research focus, the challenges of training a 70B model, and the process of building a cluster from scratch, and it closes with anecdotes and a Q&A session. The video also includes a sponsored segment on Hyperstack GPUs.

Syllabus

00:00 - Intro
00:45 - Hyperstack GPUs (sponsored segment)
02:25 - Bowei's background
11:30 - More on Imbue, their research, their focus
18:30 - Training a 70B model
26:20 - Building a cluster from scratch
45:40 - Anecdotes, Q&A


Taught by

Aleksa Gordić - The AI Epiphany

Related Courses

High Performance Computing
Georgia Institute of Technology via Udacity
Введение в параллельное программирование с использованием OpenMP и MPI
Tomsk State University via Coursera
High Performance Computing in the Cloud
Dublin City University via FutureLearn
Production Machine Learning Systems
Google Cloud via Coursera
LAFF-On Programming for High Performance
The University of Texas at Austin via edX