Tuning Machine Learning Models - Scaling, Workflows, and Architecture
Offered By: Databricks via YouTube
Course Description
Overview
Explore the intricacies of tuning machine learning models in this 24-minute conference talk from Databricks. Delve into the automation of hyperparameter tuning, scaling techniques using Apache Spark, and best practices for optimizing workflows and architecture. Learn how to leverage Hyperopt, a popular open-source tool for ML tuning in Python, and discover its Spark-powered backend for enhanced scalability. Gain insights into effective tuning workflows, including how to select parameters, track progress, and iterate using MLflow. Examine architectural patterns for both single-machine and distributed ML workflows, and understand how to optimize data ingestion with Spark. Discover the potential of joblib-spark for distributing scikit-learn tuning jobs across Spark clusters. While generally accessible, this talk is particularly valuable for those with knowledge of machine learning and Spark.
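To give a feel for the tooling the talk covers, here is a minimal sketch of tuning with Hyperopt's Spark-powered backend (SparkTrials) while tracking the run in MLflow. The dataset, model, search space, and parallelism value are illustrative assumptions, not taken from the talk itself.

```python
import mlflow
from hyperopt import fmin, tpe, hp, STATUS_OK, SparkTrials
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

def objective(params):
    # Train and score one model for a single hyperparameter setting.
    model = RandomForestClassifier(
        max_depth=int(params["max_depth"]),
        n_estimators=int(params["n_estimators"]),
        random_state=0,
    )
    accuracy = cross_val_score(model, X, y, cv=3).mean()
    # Hyperopt minimizes the returned loss, so negate the accuracy.
    return {"loss": -accuracy, "status": STATUS_OK}

search_space = {
    "max_depth": hp.quniform("max_depth", 2, 10, 1),
    "n_estimators": hp.quniform("n_estimators", 10, 200, 10),
}

# SparkTrials fans trials out to Spark executors; the plain Trials() class runs them locally.
spark_trials = SparkTrials(parallelism=4)  # parallelism value is an assumption

with mlflow.start_run():  # record the tuning run and its trials in MLflow
    best = fmin(
        fn=objective,
        space=search_space,
        algo=tpe.suggest,
        max_evals=32,
        trials=spark_trials,
    )
print(best)
```

The talk also mentions joblib-spark for distributing scikit-learn tuning jobs across a Spark cluster. A minimal sketch of that pattern, again with an illustrative estimator and grid, could look like this:

```python
from joblibspark import register_spark
from sklearn.utils import parallel_backend
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

register_spark()  # make "spark" available as a joblib backend

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
param_grid = {"C": [0.1, 1.0, 10.0], "gamma": ["scale", 0.01]}
search = GridSearchCV(SVC(), param_grid, cv=3)

# Each cross-validation fit is shipped to a Spark executor instead of a local process.
with parallel_backend("spark", n_jobs=4):
    search.fit(X, y)

print(search.best_params_)
```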
Syllabus
Introduction
What are hyperparameters
Tuning ML models
Hyperparameters
Single Machine Training
Distributed Training
Training One Model Per Group
Workflows
Models vs Pipelines
Resources
Taught by
Databricks
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering - University of California, Berkeley via edX
Big Data Analytics - University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD - Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames - Yandex via Coursera
Introduction to Apache Spark and AWS - University of London International Programmes via Coursera