Enhancing the Performance Testing Process for gRPC Model Inferencing at Scale
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore the intricacies of performance testing for gRPC model inferencing at scale in this conference talk. Discover how to set up a Kubernetes cluster with KServe's ModelMesh for high-density deployment of machine learning models. Learn about load testing thousands of models and using Prometheus and Grafana to monitor key performance metrics. Gain insights into the complexities of model deployment, scalability challenges, and the features of ModelMesh. Delve into the automation of performance testing, including the setup of testing environments, the QFlow pipeline, and the k6 load testing tool. Watch a demonstration of the testing process, analyze testing logs and results, and understand the implications of cache-miss actions. Finally, evaluate whether ModelMesh fits your specific use case.
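The load tests described in the talk target gRPC inference endpoints speaking KServe's v2 (open) inference protocol. As a rough illustration of the request shape such a test would send, here is a minimal sketch of building a v2 inference payload in its JSON form (the gRPC ModelInferRequest carries equivalent fields as protobuf); the model name, tensor name, and shape below are hypothetical:

```python
def build_infer_request(model_name, data, shape, datatype="FP32"):
    """Build a KServe v2 (open inference protocol) request body.

    Field names follow the v2 protocol; the tensor name "input-0"
    and all argument values used below are illustrative.
    """
    return {
        "model_name": model_name,
        "inputs": [
            {
                "name": "input-0",  # hypothetical tensor name
                "shape": shape,
                "datatype": datatype,
                "data": data,
            }
        ],
    }


# Example: a single 1x4 float input for a hypothetical model
request = build_infer_request(
    "example-sklearn-model", [0.1, 0.2, 0.3, 0.4], [1, 4]
)
```

A load-testing tool such as k6 would send many such payloads concurrently across thousands of model names to exercise ModelMesh's model loading and eviction behavior.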
Syllabus
Introduction
Model Deployment
Kubernetes
Complexities
KServe
Scalability
ModelMesh
ModelMesh Features
Performance Testing Automation
Performance Testing Setup
Performance Testing Environment
QFlow Pipeline
K6 Load Tools
gRPC
Prometheus
Demo
Testing
Testing Log
Testing Results
Cache Miss Action
Should I Use ModelMesh?
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Serverless Machine Learning Model Inference on Kubernetes with KServe (Devoxx via YouTube)
Machine Learning in Fastly's Compute@Edge (Linux Foundation via YouTube)
ModelMesh: Scalable AI Model Serving on Kubernetes (Linux Foundation via YouTube)
MLSecOps - Automated Online and Offline ML Model Evaluations on Kubernetes (Linux Foundation via YouTube)
Creating a Custom Serving Runtime in KServe ModelMesh - Hands-On Experience (Linux Foundation via YouTube)