Inside TensorFlow - Parameter Server Training
Offered By: TensorFlow via YouTube
Course Description
Syllabus
Intro
Parameter Server Training Overview
Adaptive Learning Rate
Synchronous Parameter Server Training
Evaluation by Estimator
Problems with Multi-Client Setup
Benefits of Single-Client Setup
Problems of Single-Client Setup
Schedule/Join APIs
Custom Training Loop with PS
Current Limitations of the APIs
Benefits of Inline Evaluation
Current Limitations of Inline Evaluation
Variable Sharding
Ongoing and Future Work
Runtime, Performance, and Scalability
Parameter server training in runtime
Invoke model func with async schedule API
Distributed functions in PS training
Large embedding model
Performance compared with Estimator
Worker profiles with multi-step packing
Multi-step packing: pros and cons
Preemptions and failures
Fault tolerance: worker failures
Large-scale fault tolerance testing
Run jobs with preemptible resources
Multi-worker testing framework
MLCompass dashboard
Taught by
TensorFlow
Related Courses
Creative Applications of Deep Learning with TensorFlow
Offered By: Kadenze
Creative Applications of Deep Learning with TensorFlow III
Offered By: Kadenze
Creative Applications of Deep Learning with TensorFlow II
Offered By: Kadenze
6.S191: Introduction to Deep Learning
Offered By: Massachusetts Institute of Technology via Independent
Learn TensorFlow and deep learning, without a Ph.D.
Offered By: Google via Independent