YoVDO

Parallelizing Your ETL with Dask on Kubeflow

Offered By: MLOps World: Machine Learning in Production via YouTube

Tags

Dask Courses, Machine Learning Courses, Python Courses, Kubernetes Courses, MLOps Courses, Data Processing Courses, Distributed Computing Courses, ETL Courses, Kubeflow Courses

Course Description

Overview

Learn how to parallelize ETL processes using Dask on Kubeflow in this conference talk. Dask is a Python library for parallel computing, and Kubeflow is an MLOps platform built on Kubernetes; the talk covers how to use Dask's parallelism within Kubeflow's notebook service and pipeline workflows. It introduces the new Dask Operator for Kubernetes, which lets users launch Dask clusters from Jupyter sessions and pipeline steps, and explains how to use Dask's distributed computing to process larger-than-memory datasets and improve performance in machine learning pipelines. The speaker demonstrates installation, works through practical examples, and shows the benefits of combining Dask and Kubeflow for data processing and ML workflows.
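
As a rough illustration of the kind of parallel ETL the talk describes, the sketch below builds a small extract-transform-load graph with `dask.delayed` and runs it on Dask's local threaded scheduler. It is not taken from the talk: the partition data and the `extract`, `transform`, `load`, `build_etl_graph`, and `run_etl` names are hypothetical. On Kubeflow, the same graph would presumably be submitted to a cluster started through the Dask Operator (for example via `dask_kubernetes.operator.KubeCluster` and a `distributed.Client`), which is omitted here because it needs a live Kubernetes cluster.

```python
from dask import delayed

# Hypothetical in-memory "partitions"; a real ETL job would read files or tables.
RAW_PARTITIONS = [
    ["3", "4", "x"],
    ["10", "", "5"],
    ["7"],
]


def extract(partition_id):
    """Load one partition of raw string records."""
    return RAW_PARTITIONS[partition_id]


def transform(records):
    """Keep only records that parse as non-negative integers."""
    return [int(r) for r in records if r.strip().isdigit()]


def load(cleaned_partitions):
    """Combine the per-partition results into one summary."""
    values = [v for part in cleaned_partitions for v in part]
    return {"rows": len(values), "total": sum(values)}


def build_etl_graph():
    """Build a lazy task graph: one extract/transform chain per partition."""
    cleaned = [
        delayed(transform)(delayed(extract)(i))
        for i in range(len(RAW_PARTITIONS))
    ]
    # The list of delayed results is resolved before `load` is called.
    return delayed(load)(cleaned)


def run_etl(scheduler="threads"):
    """Execute the graph locally; per-partition chains can run in parallel."""
    # Assumption: with an active distributed Client, calling .compute()
    # without a scheduler argument would run on that cluster instead.
    return build_etl_graph().compute(scheduler=scheduler)


if __name__ == "__main__":
    print(run_etl())
```

Because each partition is processed independently and only the final `load` step combines results, no single task needs the whole dataset in memory, which is the property that makes larger-than-memory processing feasible.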

Syllabus

Parallelizing Your ETL with Dask on Kubeflow


Taught by

MLOps World: Machine Learning in Production

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent