Horovod - Distributed Deep Learning for Reliable MLOps
Offered By: Linux Foundation via YouTube
Syllabus
Intro
Early Adoption of Horovod
Deep Learning Refresher
Distributed Deep Learning
Early Distributed Training - Parameter Servers
Parameter Servers - Tradeoffs
Horovod Technique: Allreduce
Benchmarking
Deep Learning in Research
Deep Learning in Production
Feature Store
Model Training
Preprocessing
Spark ML Pipelines
Petastorm: Data Access for Deep Learning Training
Challenges of Training on Large Datasets
Spark 3.0: Resource Aware Scheduling
What if my Spark cluster doesn't have GPUs?
Horovod Lambda - Run data processing on CPUs with Spark
Online Prediction
Neuropod: Out-of-Process Execution
Workflow Authoring
Can we ideate, define, evaluate and deploy a Deep Learning model all within a single script?
Feature Engineering
Model Construction
Model Deployment
Elastic Horovod: Control Flow
Taught by
Linux Foundation
Related Courses
Machine Learning Operations (MLOps): Getting Started - Google Cloud via Coursera
Design and Implementation of Machine Learning Systems - Higher School of Economics via Coursera
Demystifying Machine Learning Operations (MLOps) - Pluralsight
Machine Learning Engineer with Microsoft Azure - Microsoft via Udacity
Machine Learning Engineering for Production (MLOps) - DeepLearning.AI via Coursera