YoVDO

AI Inference Workloads - Solving MLOps Challenges in Production

Offered By: Toronto Machine Learning Series (TMLS) via YouTube

Tags

MLOps Courses
Machine Learning Courses
DevOps Courses
Kubernetes Courses

Course Description

Overview

Explore the challenges and solutions for AI inference workloads in production environments during this 55-minute conference talk from the Toronto Machine Learning Series. Dive into the complexities of moving machine learning prototypes to production, focusing on throughput, latency, and GPU utilization. Learn about fractional GPU capabilities and their impact on performance. Discover how a leading organization built an inference platform using Kubernetes and NVIDIA A100 MIG technology to scale AI initiatives. Gain insights into deployment types for inference workloads, embedding ML models into web servers, and decoupling web and model serving. Understand the concept of Multi-Instance GPU (MIG) and its applications in model inferencing. Benefit from the speaker's expertise in DevOps, Cloud Computing, Kubernetes, and AI computing to overcome MLOps challenges and optimize your AI inference workflows.

Syllabus

Intro
Agenda
The Machine Learning Process
Deployment Types for Inference Workloads
Machine Learning is Different than Traditional Software Engineering
Low Latency
High Throughput
Maximize GPU Utilization
Embedding ML Models into Web Servers
Decouple Web Serving and Model Serving
Model Serving System on Kubernetes
Multi-Instance GPU (MIG)
Run:AI's Dynamic MIG Allocations
Run 3 instances of type 2g.10gb
Valid Profiles & Configurations
Serving on Fractional GPUs
A Game Changer for Model Inferencing
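
As a rough illustration of the "Run 3 instances of type 2g.10gb" and "Valid Profiles & Configurations" items above, the sketch below checks whether a set of MIG profiles fits on one A100 40GB by summing compute slices and memory. It is not taken from the talk. The function names are hypothetical, and the check assumes that a profile named Ng.Mgb uses N of the GPU's 7 compute slices and M GB of its 40 GB of memory. Real MIG placement rules are stricter than a simple sum, so treat this as a capacity estimate only.

```python
# Simplified capacity check for MIG profiles on an A100 40GB.
# Assumption: a profile "Ng.Mgb" uses N compute slices and M GB of memory.
# Real MIG placement rules are stricter than this sum check.

A100_COMPUTE_SLICES = 7
A100_MEMORY_GB = 40


def parse_profile(profile: str) -> tuple[int, int]:
    """Split a name like '2g.10gb' into (compute_slices, memory_gb)."""
    compute_part, memory_part = profile.split(".")
    return int(compute_part.rstrip("g")), int(memory_part[:-2])


def fits_on_gpu(profiles: list[str]) -> bool:
    """Return True if the summed slices and memory stay within one GPU."""
    total_compute = sum(parse_profile(p)[0] for p in profiles)
    total_memory = sum(parse_profile(p)[1] for p in profiles)
    return total_compute <= A100_COMPUTE_SLICES and total_memory <= A100_MEMORY_GB


print(fits_on_gpu(["2g.10gb"] * 3))  # 6 slices, 30 GB
print(fits_on_gpu(["2g.10gb"] * 4))  # 8 slices, 40 GB
```

Under these assumptions, three 2g.10gb instances fit on one GPU, while a fourth would exceed the 7 available compute slices.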


Taught by

Toronto Machine Learning Series (TMLS)

Related Courses

Machine Learning Operations (MLOps): Getting Started
Google Cloud via Coursera
Design and Implementation of Machine Learning Systems
Higher School of Economics via Coursera
Demystifying Machine Learning Operations (MLOps)
Pluralsight
Machine Learning Engineer with Microsoft Azure
Microsoft via Udacity
Machine Learning Engineering for Production (MLOps)
DeepLearning.AI via Coursera