Next-Generation Networks for Machine Learning

Offered By: Scalable Parallel Computing Lab, SPCL @ ETH Zurich via YouTube

Tags

Machine Learning Courses, Network Topologies Courses, Congestion Control Courses, Deep Neural Networks Courses, Distributed Training Courses

Course Description

Overview

Explore cutting-edge techniques for accelerating distributed deep neural network (DNN) training in this 50-minute conference talk by Manya Ghobadi at SPCL_Bcast. Delve into the challenges posed by increasing dataset and model sizes, and discover innovative solutions to overcome network bottlenecks in datacenter environments. Learn about a novel optical fabric that optimizes network topology and parallelization strategies for DNN clusters. Examine the limitations of fair-sharing in congestion control algorithms and understand a new scheduling approach that strategically places jobs on network links to enhance performance. Gain insights into the future of machine learning infrastructure and network design for improved training efficiency.

Syllabus

Introduction
Talk
Announcements


Taught by

Scalable Parallel Computing Lab, SPCL @ ETH Zurich

Related Courses

Custom and Distributed Training with TensorFlow (DeepLearning.AI via Coursera)
Architecting Production-ready ML Models Using Google Cloud ML Engine (Pluralsight)
Building End-to-end Machine Learning Workflows with Kubeflow (Pluralsight)
Deploying PyTorch Models in Production: PyTorch Playbook (Pluralsight)
Inside TensorFlow (TensorFlow via YouTube)