YoVDO

Resource-Aware Scheduling for Production GenAI with RAG on Multicluster Cloud Kubernetes

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Kubernetes Courses
Cloud Computing Courses
Vector Databases Courses
Load Balancing Courses
Retrieval Augmented Generation Courses

Course Description

Overview

Explore a comprehensive approach to resource-aware scheduling for production GenAI with Retrieval-Augmented Generation (RAG) in a multicluster cloud Kubernetes environment. Dive into the advantages of self-hosting GenAI models, including improved control, privacy, performance, and cost-effectiveness. Learn how Kubernetes cloud resource management provides a flexible hosting platform for these systems. Discover the proposed architecture utilizing multiple Kubernetes clusters and a resource-aware policy-based cluster scheduler. Examine the key components of this setup, including vector databases for RAG contexts, load-balanced query services, prediction services for model execution, and ingestion services for vector database updates. Understand the benefits of using a cloud-native multi-region scalable vector database and running services across different Kubernetes clusters. Gain insights into the geographical distribution of CPU and GPU clusters for optimal reliability, latency, and resource availability. Explore the role of the cluster scheduler in placement and scaling decisions. Analyze the benefits of this approach and learn about a reference implementation to help you apply these concepts in your own GenAI projects.
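To make the idea of a resource-aware, policy-based cluster scheduler more concrete, here is a minimal Python sketch of a placement decision. It is an illustration only and not the talk's reference implementation: the `Cluster` and `Workload` types, the `place` function, and the policy itself (prefer the workload's region, keep scarce GPUs free, then pick the most CPU headroom) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    # Hypothetical view of one Kubernetes cluster's spare capacity.
    name: str
    region: str
    free_cpu: float  # spare CPU cores
    free_gpu: int    # spare GPUs


@dataclass
class Workload:
    # Hypothetical resource request for one service replica.
    name: str
    cpu: float
    gpu: int
    preferred_region: str


def place(workload: Workload, clusters: List[Cluster]) -> Optional[Cluster]:
    """Pick a cluster for the workload, or return None if nothing fits."""
    # Resource awareness: keep only clusters with enough spare CPU and GPU.
    feasible = [
        c for c in clusters
        if c.free_cpu >= workload.cpu and c.free_gpu >= workload.gpu
    ]
    if not feasible:
        return None

    # Assumed policy, in priority order:
    #   1. prefer the workload's region (latency),
    #   2. prefer the tightest GPU fit, so CPU-only services stay off GPU clusters,
    #   3. prefer the most CPU headroom left after placement.
    def score(c: Cluster):
        return (
            c.region == workload.preferred_region,
            -(c.free_gpu - workload.gpu),
            c.free_cpu - workload.cpu,
        )

    best = max(feasible, key=score)
    # Reserve the resources so later decisions see the updated capacity.
    best.free_cpu -= workload.cpu
    best.free_gpu -= workload.gpu
    return best


if __name__ == "__main__":
    clusters = [
        Cluster("cpu-east", "east", free_cpu=16, free_gpu=0),
        Cluster("gpu-west", "west", free_cpu=8, free_gpu=2),
        Cluster("gpu-east", "east", free_cpu=4, free_gpu=1),
    ]
    for w in [
        Workload("query", cpu=2, gpu=0, preferred_region="east"),
        Workload("prediction", cpu=4, gpu=1, preferred_region="east"),
    ]:
        chosen = place(w, clusters)
        print(w.name, "->", chosen.name if chosen else "unschedulable")
```

In this sketch a CPU-only query service lands on a CPU cluster while the GPU-bound prediction service goes to a GPU cluster in the preferred region, falling back to another region when local capacity runs out. A real scheduler would also handle scaling, eviction, and live capacity updates, which are left out here.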

Syllabus

Resource-Aware Scheduling for Production GenAI with RAG running on Multicluster Cloud Kubernetes


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

Vector Similarity Search
Data Science Dojo via YouTube
Supercharging Semantic Search with Pinecone and Cohere
Pinecone via YouTube
Search Like You Mean It - Semantic Search with NLP and a Vector Database
Pinecone via YouTube
The Rise of Vector Data
Pinecone via YouTube
NER Powered Semantic Search in Python
James Briggs via YouTube