YoVDO

Running Multiple Models on the Same GPU on Spot Instances

Offered By: MLOps World: Machine Learning in Production via YouTube

Tags

Machine Learning Inference Courses Cloud Computing Courses MLOps Courses Cloud Infrastructure Courses Model Deployment Courses Cost Optimization Courses

Course Description

Overview

Discover cost-effective strategies for running machine learning inference in the cloud through this 33-minute conference talk from MLOps World: Machine Learning in Production. Explore GPU fractionalization and the use of Spot instances as presented by Oscar Rovira, Co-founder of Mystic AI. Learn about the benefits and limitations of GPU fractionalization, as well as the value and potential challenges of utilizing Spot instances. Gain insights into how combining these approaches can significantly increase throughput and reduce costs for your GenAI applications, with practical examples provided to illustrate these optimization techniques.
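To make the idea of GPU fractionalization concrete, below is a minimal sketch (not the speaker's implementation) of co-locating two independent models on a single GPU, with each request dispatched on its own CUDA stream so their kernels can overlap. The model shapes and names are illustrative assumptions; a production setup of the kind discussed in the talk would add batching, per-model memory limits, and checkpointing so work survives a Spot-instance interruption.

```python
# Sketch: two unrelated models sharing one GPU (basic fractionalization).
# All model definitions here are placeholders for illustration only.
import contextlib

import torch
import torch.nn as nn

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Two independent models loaded into the same device's memory.
model_a = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).to(device).eval()
model_b = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2)).to(device).eval()


def infer(model: nn.Module, batch: torch.Tensor) -> torch.Tensor:
    # Run each request on its own CUDA stream when a GPU is present,
    # so the two models' kernels can overlap; fall back to plain CPU execution otherwise.
    stream_ctx = torch.cuda.stream(torch.cuda.Stream()) if use_cuda else contextlib.nullcontext()
    with torch.no_grad(), stream_ctx:
        return model(batch.to(device, non_blocking=True))


out_a = infer(model_a, torch.randn(32, 512))
out_b = infer(model_b, torch.randn(32, 128))
if use_cuda:
    torch.cuda.synchronize()
print(out_a.shape, out_b.shape)
```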

Syllabus

Running Multiple Models on the Same GPU, on Spot Instances


Taught by

MLOps World: Machine Learning in Production

Related Courses

Introduction to AWS Inferentia and Amazon EC2 Inf1 Instances
Pluralsight
Introduction to AWS Inferentia and Amazon EC2 Inf1 Instances (Korean)
Amazon Web Services via AWS Skill Builder
Introduction to Amazon Elastic Inference
Amazon Web Services via AWS Skill Builder
TensorFlow Lite - Solution for Running ML On-Device
TensorFlow via YouTube
Inference on KubeEdge
Linux Foundation via YouTube