Operating Deep Learning Pipelines Anywhere Using Kubeflow
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore the intricacies of operating deep learning pipelines using Kubeflow in this comprehensive conference talk by Jörg Schad and Gilbert Song from Mesosphere. Dive into the process of building a production-grade data science pipeline, integrating Kubeflow with open-source data, streaming, and CI/CD automation tools. Learn about essential components such as data preparation using Apache Spark or Apache Flink, data storage with HDFS and Cassandra, automation via Jenkins, and request streaming with Apache Kafka. Discover how to construct and manage a complete deep learning pipeline for multiple tenants, covering topics like data cleansing, model storage, distributed training, monitoring, and infrastructure management. Gain insights into addressing challenges in data quality, division of labor between data scientists and software engineers, and implementing data science engineering principles. Explore advanced concepts including reproducible builds, MLflow integration, feature catalogs, model libraries, and resource service management to enhance your deep learning pipeline operations.
Syllabus
Introduction
The Advanced Pipeline
Challenges
Data Quality
Division of Labor
Data Science vs Software Engineering
Data Science Engineering Principles
Ian Good
Requirements Engineering
Reproducible Builds
MLflow
Data Scientist
Jupyter
Feature Catalog
Model Libraries
Model Optimization
Resource Service Management
Wishlist
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Big Data Analytics (University of Adelaide via edX)
Big Data Essentials: HDFS, MapReduce and Spark RDD (Yandex via Coursera)
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames (Yandex via Coursera)
Introduction to Apache Spark and AWS (University of London International Programmes via Coursera)