Parallelizing Your ETL with Dask on Kubeflow

Offered By: MLOps World: Machine Learning in Production via YouTube

Tags

Dask Courses, Machine Learning Courses, Python Courses, Kubernetes Courses, MLOps Courses, Data Processing Courses, Distributed Computing Courses, ETL Courses, Kubeflow Courses

Course Description

Overview

Learn how to parallelize ETL processes using Dask on Kubeflow in this conference talk. Dask is a Python library for parallel computing, and Kubeflow is an MLOps platform built on Kubernetes. The talk shows how to use Dask's parallelism within Kubeflow's notebook service and pipeline workflows, and introduces the new Dask Operator for Kubernetes, which lets users launch Dask clusters from Jupyter sessions and pipeline steps. It also covers how Dask's distributed computing can process larger-than-memory datasets and improve performance in machine learning pipelines. The speaker walks through installation, works through practical examples, and shows the benefits of combining Dask and Kubeflow for data processing and ML workflows.
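
To give a feel for the kind of workflow the talk describes, here is a minimal sketch of a parallel ETL job built with `dask.delayed`. It is not taken from the talk: the `extract`, `transform`, and `load` functions and their data are hypothetical stand-ins, and the cluster name `etl-demo` is made up. The `connect_to_kubernetes_cluster` helper assumes the `dask-kubernetes` package is installed and the Dask Operator is already running in the Kubernetes cluster; it is never called in the sketch itself, so the example also runs on a laptop with Dask's local scheduler.

```python
from dask import delayed


def extract(partition_id):
    # Hypothetical stand-in for reading one file or table partition.
    return [partition_id * 10 + i for i in range(5)]


def transform(rows):
    # Keep the even values and square them.
    return [r * r for r in rows if r % 2 == 0]


def load(chunks):
    # Hypothetical stand-in for writing to a sink; here we only total the values.
    return sum(sum(chunk) for chunk in chunks)


def build_etl(n_partitions):
    # Build a lazy task graph with one extract/transform pair per partition.
    # Nothing runs until .compute() is called on the returned object.
    transformed = [
        delayed(transform)(delayed(extract)(i)) for i in range(n_partitions)
    ]
    return delayed(load)(transformed)


def connect_to_kubernetes_cluster(name="etl-demo", workers=2):
    # Assumption: dask-kubernetes is installed and the Dask Operator is
    # deployed in the cluster. Not called below; shown for orientation only.
    from dask.distributed import Client
    from dask_kubernetes.operator import KubeCluster

    cluster = KubeCluster(name=name)
    cluster.scale(workers)
    return Client(cluster)


if __name__ == "__main__":
    # With no distributed Client active, Dask uses a local scheduler.
    print(build_etl(4).compute())
```

In a Kubeflow notebook, one would presumably call `connect_to_kubernetes_cluster()` first, so that the same `build_etl(...).compute()` call runs its partitions on the operator-managed workers instead of locally.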

Syllabus

Parallelizing Your ETL with Dask on Kubeflow


Taught by

MLOps World: Machine Learning in Production

Related Courses

Building End-to-end Machine Learning Workflows with Kubeflow
Pluralsight
Smart Analytics, Machine Learning, and AI on GCP
Pluralsight
Leveraging Cloud-Based Machine Learning on Google Cloud Platform: Real World Applications
LinkedIn Learning
Distributed TensorFlow - TensorFlow at O'Reilly AI Conference, San Francisco '18
TensorFlow via YouTube
KFServing - Model Monitoring with Apache Spark and Feature Store
Databricks via YouTube