Imbue - Training a 70B Model from Scratch - Infrastructure and Challenges
Offered By: Aleksa Gordić - The AI Epiphany via YouTube
Course Description
Overview
In this 59-minute video, Bowei from Imbue discusses the company's project of training a 70B model from scratch and the infrastructure built to support it, as described in Imbue's detailed blog post. The conversation covers Bowei's background, Imbue's research focus, the challenges of training a 70B model, and the process of building a cluster from scratch, and closes with anecdotes and a Q&A session. A short sponsored segment on Hyperstack GPUs appears near the start. The video is aimed at viewers interested in large-scale model training and AI infrastructure.
Syllabus
00:00 - Intro
00:45 - Hyperstack GPUs (sponsored segment)
02:25 - Bowei's background
11:30 - More on Imbue, their research, their focus
18:30 - Training a 70B model
26:20 - Building a cluster from scratch
45:40 - Anecdotes, Q&A
Taught by
Aleksa Gordić - The AI Epiphany
Related Courses
Cloud Computing Concepts, Part 1 (University of Illinois at Urbana-Champaign via Coursera)
Cloud Computing Concepts: Part 2 (University of Illinois at Urbana-Champaign via Coursera)
Reliable Distributed Algorithms - Part 1 (KTH Royal Institute of Technology via edX)
Introduction to Apache Spark and AWS (University of London International Programmes via Coursera)
Réalisez des calculs distribués sur des données massives (CentraleSupélec via OpenClassrooms)