YoVDO

Parallelizing Your ETL with Dask on Kubeflow

Offered By: MLOps World: Machine Learning in Production via YouTube

Tags

Dask Courses Machine Learning Courses Python Courses Kubernetes Courses MLOps Courses Data Processing Courses Distributed Computing Courses ETL Courses Kubeflow Courses

Course Description

Overview

Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Learn how to parallelize ETL processes using Dask on Kubeflow in this comprehensive conference talk. Explore the integration of Dask, a powerful Python library for parallel computing, with Kubeflow, a popular MLOps platform built on Kubernetes. Discover how to leverage Dask's advanced parallelism capabilities within Kubeflow's notebook service and pipeline workflows. Gain insights into the new Dask Operator for Kubernetes, which enables users to launch Dask clusters from Jupyter sessions and pipeline steps. Understand how to utilize Dask's distributed computing power to process larger-than-memory datasets and optimize performance in machine learning pipelines. Follow along as the speaker demonstrates installation procedures, provides practical examples, and showcases the benefits of combining Dask and Kubeflow for efficient data processing and ML workflows.

Syllabus

Parallelizing Your ETL with Dask on Kubeflow


Taught by

MLOps World: Machine Learning in Production

Related Courses

4.0 Shades of Digitalisation for the Chemical and Process Industries
University of Padova via FutureLearn
A Day in the Life of a Data Engineer
Amazon Web Services via AWS Skill Builder
FinTech for Finance and Business Leaders
ACCA via edX
Accounting Data Analytics
University of Illinois at Urbana-Champaign via Coursera
Accounting Data Analytics
Coursera