Heterogeneous Training Cluster with Ray at Netflix
Offered By: Anyscale via YouTube
Course Description
Overview
Explore the benefits of using Ray to build a heterogeneous training cluster for deep learning models at Netflix. Learn how to set up a cluster with a mix of CPU and GPU instances, run distributed training jobs, and leverage Ray's automatic resource allocation for scheduling different types of workers. Discover best practices for configuring and managing persistent clusters using Ray, while addressing challenges in building and maintaining such systems. Gain insights into how Netflix's Machine Learning Platform team optimizes infrastructure for various use cases, including recommendations, content understanding, and artwork generation. Understand the importance of reliable, scalable, and robust training and deployment of machine learning models in the entertainment industry.
Syllabus
Heterogeneous Training Cluster with Ray at Netflix
Taught by
Anyscale
Related Courses
Financial Sustainability: The Numbers side of Social Enterprise
+Acumen via NovoEd
Cloud Computing Concepts: Part 2
University of Illinois at Urbana-Champaign via Coursera
Developing Repeatable Models® to Scale Your Impact
+Acumen via Independent
Managing Microsoft Windows Server Active Directory Domain Services
Microsoft via edX
Introduction aux conteneurs
Microsoft Virtual Academy via OpenClassrooms