YoVDO

Resource-Aware Scheduling for Production GenAI with RAG on Multicluster Cloud Kubernetes

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Kubernetes Courses, Cloud Computing Courses, Vector Databases Courses, Load Balancing Courses, Retrieval Augmented Generation Courses

Course Description

Overview

Explore an approach to resource-aware scheduling for production GenAI with Retrieval-Augmented Generation (RAG) in a multicluster cloud Kubernetes environment. Learn why self-hosting GenAI models can improve control, privacy, performance, and cost-effectiveness, and how Kubernetes cloud resource management provides a flexible hosting platform for these systems.

Discover the proposed architecture, which uses multiple Kubernetes clusters and a resource-aware, policy-based cluster scheduler. Its key components are vector databases that hold the RAG contexts, load-balanced query services, prediction services that execute the models, and ingestion services that update the vector database. Understand the benefits of using a cloud-native, multi-region, scalable vector database and of running the services across different Kubernetes clusters. See how CPU and GPU clusters can be distributed geographically to improve reliability, latency, and resource availability, and how the cluster scheduler makes placement and scaling decisions. Finally, review the benefits of this approach and a reference implementation that can help you apply these concepts in your own GenAI projects.
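To make the idea of resource-aware, policy-based placement more concrete, here is a minimal Python sketch. It is not the scheduler or the reference implementation presented in the talk. The `Cluster` and `Workload` types, the `place` function, the cluster names, and the policy itself (keep GPU clusters free for GPU workloads, then prefer the lowest latency) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    """Hypothetical view of one Kubernetes cluster's spare capacity."""
    name: str
    region: str
    free_cpu: float    # free CPU cores
    free_gpu: int      # free GPUs
    latency_ms: float  # assumed latency from the cluster to its users


@dataclass(frozen=True)
class Workload:
    """Hypothetical resource request for one service replica."""
    name: str
    cpu: float
    gpu: int = 0


def place(workload: Workload, clusters: Sequence[Cluster]) -> Optional[Cluster]:
    """Pick a cluster for the workload, or None if nothing fits.

    Assumed policy:
      1. Only clusters with enough free CPU and GPU are candidates.
      2. CPU-only workloads avoid clusters that have free GPUs,
         so GPU capacity stays available for prediction services.
      3. Among the remaining candidates, the lowest latency wins.
    """
    candidates = [
        c for c in clusters
        if c.free_cpu >= workload.cpu and c.free_gpu >= workload.gpu
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda c: (workload.gpu == 0 and c.free_gpu > 0, c.latency_ms),
    )


if __name__ == "__main__":
    clusters = [
        Cluster("cpu-east", "us-east", free_cpu=32, free_gpu=0, latency_ms=20),
        Cluster("gpu-east", "us-east", free_cpu=16, free_gpu=4, latency_ms=25),
        Cluster("gpu-west", "us-west", free_cpu=16, free_gpu=8, latency_ms=70),
    ]
    for w in (Workload("query-service", cpu=2),
              Workload("prediction-service", cpu=4, gpu=1)):
        chosen = place(w, clusters)
        print(w.name, "->", chosen.name if chosen else "unschedulable")
```

With the sample data above, the CPU-only query service lands on the CPU cluster and the prediction service lands on the nearer GPU cluster. A real scheduler would also have to handle scaling, changing capacity, and failures, which this sketch leaves out.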

Syllabus

Resource-Aware Scheduling for Production GenAI with RAG running on Multicluster Cloud Kubernetes


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

Software as a Service
University of California, Berkeley via Coursera
Software Defined Networking
Georgia Institute of Technology via Coursera
Pattern-Oriented Software Architectures: Programming Mobile Services for Android Handheld Systems
Vanderbilt University via Coursera
Web-Technologien
openHPI
Données et services numériques, dans le nuage et ailleurs
Certificat informatique et internet via France Université Numérique