YoVDO

Building Your Own ChatGPT-style LLM AI Infrastructure with Kubernetes

Offered By: Tejas Kumar via YouTube

Tags

Kubernetes Courses, Data Engineering Courses, GPU Computing Courses, Scalability Courses, OpenAI Courses, vLLM Courses

Course Description

Overview

This video, featuring John McBride, walks through building a ChatGPT-style LLM AI infrastructure on Kubernetes. It examines the challenges of deploying open-source AI technologies at scale and the solutions adopted, with a focus on Kubernetes as a platform for running compute-intensive tasks. The discussion covers the decision to use TimescaleDB for storing time-series data and vectors, and the migration from OpenAI to an open-source large language model inference engine. It also addresses selecting the right level of abstraction, understanding trade-offs, and evaluating language model performance. Practical topics include deploying Kubernetes, setting up node groups with GPUs, and using vLLM as the inference engine. Whether you are at a startup considering Kubernetes adoption or an experienced developer looking to optimize AI infrastructure, the talk offers takeaways on building and managing AI-enabled applications at scale.

Syllabus

John McBride
Introduction and Background
Summary of the Blog Post
The Role of Kubernetes in AI-Enabled Applications
The Use of TimescaleDB for Storing Time-Series Data and Vectors
Migrating to an Open-Source LLM Inference Engine
Deploying Kubernetes and Setting Up Node Groups
Choosing vLLM as the Inference Engine
The Migration Process: Deploying Kubernetes and Setting Up Node Groups
Choosing the Right Level of Abstraction
Challenges in Evaluating Language Model Performance
Considerations for Adopting Kubernetes in Startups


Taught by

Tejas Kumar

Related Courses

Biomolecular Modeling on GPU
Moscow Institute of Physics and Technology via Coursera
Practical Deep Learning For Coders
fast.ai via Independent
GPU Architectures And Programming
Indian Institute of Technology, Kharagpur via Swayam
Perform Real-Time Object Detection with YOLOv3
Coursera Project Network via Coursera
Getting Started with PyTorch
Coursera Project Network via Coursera