How to Reduce ML Computing Costs: Building Efficient Multi-Cloud Infrastructure
Offered By: MLOps World: Machine Learning in Production via YouTube
Course Description
Overview
Discover how to significantly reduce machine learning computing costs in this 38-minute conference talk from MLOps World. Learn from Jaeman An, Founder & CEO of VESSL AI, as he shares insights on building time- and cost-effective ML infrastructure. Explore hybrid cloud architectures using Terraform and Kubernetes, cost optimization techniques with spot instances and fractional GPUs, and solutions to common multi-environment challenges. Gain practical knowledge on dataset mounting, network performance optimization, and server monitoring. Follow a step-by-step guide to implement these strategies and potentially achieve over 80% cost savings on ML projects.
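As a rough illustration of how spot instances and fractional GPUs can compound toward a figure like the 80% mentioned above, here is a minimal arithmetic sketch. The hourly price, spot discount, and number of jobs sharing a GPU are hypothetical values chosen for the example, not numbers from the talk, and the function names are made up for this sketch.

```python
def effective_cost(on_demand_hourly, spot_discount, jobs_per_gpu):
    """Hourly cost per job after a spot discount and GPU sharing.

    All inputs are hypothetical illustration values, not figures from the talk.
    """
    if not 0 <= spot_discount < 1:
        raise ValueError("spot_discount must be in [0, 1)")
    if jobs_per_gpu < 1:
        raise ValueError("jobs_per_gpu must be at least 1")
    # A spot instance is billed at a discount off the on-demand price.
    spot_hourly = on_demand_hourly * (1 - spot_discount)
    # Splitting one GPU across several jobs divides the cost per job.
    return spot_hourly / jobs_per_gpu


def savings_fraction(on_demand_hourly, spot_discount, jobs_per_gpu):
    """Fraction saved per job relative to one on-demand GPU per job."""
    cost = effective_cost(on_demand_hourly, spot_discount, jobs_per_gpu)
    return 1 - cost / on_demand_hourly


if __name__ == "__main__":
    # Assumed: $3.00/hour on demand, 60% spot discount, 2 jobs per GPU.
    saved = savings_fraction(3.00, spot_discount=0.60, jobs_per_gpu=2)
    print(f"Savings per job: {saved:.0%}")
```

The two effects multiply: under these assumed inputs, a 60% discount leaves 40% of the original cost, and sharing the GPU between two jobs halves that again. Real savings depend on actual spot pricing, interruption rates, and whether workloads fit on a GPU fraction.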
Syllabus
Intro
From notebooks to training jobs
Experiment dashboard
Cluster Dashboard
Multi-Cloud ML Infrastructure
Build hybrid cluster with Kubernetes & Terraform
Terraforming AWS Infrastructure
Test AWS Infrastructure
Terraforming GCP Infrastructure
Cloud Troubleshooting
Dataset Mounting
Cluster Management & Monitoring
Common Interface
Fractional GPUs
multicluster-scheduler
Recap: Step-by-Step Guide
Taught by
MLOps World: Machine Learning in Production
Related Courses
Software as a Service (University of California, Berkeley via Coursera)
Software Defined Networking (Georgia Institute of Technology via Coursera)
Pattern-Oriented Software Architectures: Programming Mobile Services for Android Handheld Systems (Vanderbilt University via Coursera)
Web-Technologien (openHPI)
Données et services numériques, dans le nuage et ailleurs (Certificat informatique et internet via France Université Numerique)