YoVDO

Host Your Own Llama 3 Chatbot in 10 Minutes with Runpod and vLLM - Lecture 3

Offered By: Data Centric via YouTube

Tags

WebAssembly Courses, AI Chatbots Courses, AI Engineering Courses, RunPod Courses, vLLM Courses

Course Description

Overview

Learn how to host a Llama 3 8B chatbot in just 10 minutes using vLLM's inference server, Runpod GPUs, and Chainlit for the front end. See how to host the Llama 3 model on Runpod and build an efficient chatbot without relying on heavy frameworks. The video walks through building a Runpod template, deploying it, obtaining the endpoint, preparing the Python script, and finally launching the chatbot, offering practical insights into AI engineering and model hosting, with additional resources provided for further learning.

Syllabus

Intro
Build Runpod Template
Deploy Runpod Template
Getting the Endpoint
Prepping the Python Script
Launching the Chatbot
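The "Prepping the Python Script" step can be sketched as a small client for the deployed server's OpenAI-compatible /v1/chat/completions route. This is an illustrative sketch, not the script from the video: the endpoint URL is a placeholder, and the model name is an assumed default.

```python
# Sketch: query a vLLM server hosted on Runpod via its OpenAI-compatible
# chat-completions route. Endpoint and model name are placeholder
# assumptions, not values from the lecture.
import json
import urllib.request


def build_chat_payload(messages,
                       model="meta-llama/Meta-Llama-3-8B-Instruct",
                       max_tokens=512,
                       temperature=0.7):
    """Assemble the JSON body the OpenAI-compatible server expects."""
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def chat(endpoint, messages):
    """Send one chat turn to the server and return the reply text."""
    req = urllib.request.Request(
        endpoint + "/v1/chat/completions",
        data=json.dumps(build_chat_payload(messages)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (with a live pod; the proxy URL below is a placeholder):
#   reply = chat("https://<pod-id>-8000.proxy.runpod.net",
#                [{"role": "user", "content": "Hello!"}])
```

A front end such as Chainlit would then call chat() inside its message handler, turning this plain client into the chatbot launched in the final step.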


Taught by

Data Centric

Related Courses

Finetuning, Serving, and Evaluating Large Language Models in the Wild
Open Data Science via YouTube
Cloud Native Sustainable LLM Inference in Action
CNCF [Cloud Native Computing Foundation] via YouTube
Optimizing Kubernetes Cluster Scaling for Advanced Generative Models
Linux Foundation via YouTube
LLaMa for Developers
LinkedIn Learning
Scaling Video Ad Classification Across Millions of Classes with GenAI
Databricks via YouTube