YoVDO

Parallelizing Your ETL with Dask on Kubeflow

Offered By: MLOps World: Machine Learning in Production via YouTube

Tags

Dask Courses, Machine Learning Courses, Python Courses, Kubernetes Courses, MLOps Courses, Data Processing Courses, Distributed Computing Courses, ETL Courses, Kubeflow Courses

Course Description

Overview

Learn how to parallelize ETL processes using Dask on Kubeflow in this conference talk. Explore how Dask, a Python library for parallel computing, integrates with Kubeflow, an MLOps platform built on Kubernetes. See how to use Dask's parallelism within Kubeflow's notebook service and pipeline workflows. Get an introduction to the new Dask Operator for Kubernetes, which lets users launch Dask clusters from Jupyter sessions and pipeline steps. Understand how Dask's distributed computing can process larger-than-memory datasets and improve performance in machine learning pipelines. The speaker demonstrates installation, walks through practical examples, and shows the benefits of combining Dask and Kubeflow for data processing and ML workflows.
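
To make the idea concrete, here is a minimal sketch of an ETL step fanned out over Dask workers. It is not taken from the talk. The function names (transform_partition, merge_totals, run_etl, make_cluster_client), the CSV columns (country, amount), and the cluster name etl-demo are all hypothetical. The cluster helper assumes the dask-kubernetes package is available in the notebook image and that the Dask Operator is already installed on the Kubernetes cluster.

```python
import csv
import io


def transform_partition(raw_csv: str) -> dict:
    """Transform step: parse one CSV chunk and total `amount` per `country`."""
    totals = {}
    for row in csv.DictReader(io.StringIO(raw_csv)):
        country = row["country"].strip().upper()  # normalize the key
        totals[country] = totals.get(country, 0.0) + float(row["amount"])
    return totals


def merge_totals(partials: list) -> dict:
    """Combine the per-chunk totals into one result."""
    merged = {}
    for partial in partials:
        for country, amount in partial.items():
            merged[country] = merged.get(country, 0.0) + amount
    return merged


def run_etl(chunks, client=None) -> dict:
    """Run the ETL serially, or in parallel when given a dask.distributed Client."""
    if client is None:
        return merge_totals([transform_partition(c) for c in chunks])
    # Each chunk becomes one task scheduled on the Dask workers.
    futures = client.map(transform_partition, chunks)
    return merge_totals(client.gather(futures))


def make_cluster_client(name="etl-demo", n_workers=2):
    """Hypothetical helper: start a Dask cluster through the Dask Operator.

    Imports are deferred so the functions above work without a cluster.
    """
    from dask.distributed import Client
    from dask_kubernetes.operator import KubeCluster

    cluster = KubeCluster(name=name, n_workers=n_workers)
    return Client(cluster)
```

In a Kubeflow notebook, one might call run_etl(chunks, client=make_cluster_client()) to spread the chunks across workers. Calling run_etl(chunks) without a client runs the same logic serially, which is handy for checking the transform on a small sample first.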

Syllabus

Parallelizing Your ETL with Dask on Kubeflow


Taught by

MLOps World: Machine Learning in Production

Related Courses

Introduction to Cloud Infrastructure Technologies
Linux Foundation via edX
Scalable Microservices with Kubernetes
Google via Udacity
Google Cloud Fundamentals: Core Infrastructure
Google via Coursera
Introduction to Kubernetes
Linux Foundation via edX
Fundamentals of Containers, Kubernetes, and Red Hat OpenShift
Red Hat via edX