Ray Train: A Production-Ready Library for Distributed Deep Learning
Offered By: Anyscale via YouTube
Course Description
Overview
Explore the architecture and capabilities of Ray Train, a library for production-ready distributed deep learning, in this 32-minute conference talk. Dive into Ray Train's advanced resource scheduling, simple APIs for ecosystem integrations, and dedicated features for Large Language Model (LLM) training. Learn how Ray Train provides robust solutions for large-scale distributed training, integrates with popular deep learning frameworks, and accelerates LLM development with built-in fault tolerance and resource management. Discover how this open-source library addresses the growing complexity of deep learning models and the rise of generative AI, enabling efficient and cost-effective scaling of training. Gain insights from Anyscale software engineer Yunxuan Xiao, who shares his passion for scaling AI workloads and making machine learning more accessible and efficient.
Syllabus
Ray Train: A Production-Ready Library for Distributed Deep Learning
Taught by
Anyscale
Related Courses
Challenges and Opportunities in Applying Machine Learning - Alex Jaimes - ODSC East 2018 (Open Data Science via YouTube)
Efficient Distributed Deep Learning Using MXNet (Simons Institute via YouTube)
Benchmarks and How-Tos for Convolutional Neural Networks on HorovodRunner-Enabled Apache Spark Clusters (Databricks via YouTube)
SHADE - Enable Fundamental Cacheability for Distributed Deep Learning Training (USENIX via YouTube)
Alpa - Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning (USENIX via YouTube)