Parallel Programming with Dask in Python
Offered By: DataCamp
Course Description
Overview
Learn how to use parallel programming with Dask in Python to scale up your workflows and handle big data efficiently.
When working with big data, you’ll face two common obstacles: high memory use and long runtimes. The Dask library can lower your memory use by loading chunks of data only when needed. It can lower runtimes by using all your available computing cores in parallel. Best of all, it requires very few changes to your existing Python code. In this course, you’ll use Dask to analyze Spotify song data, process images of sign language gestures, calculate trends in weather data, analyze audio recordings, and train machine learning models on big data.
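As a rough illustration of the lazy, parallel style described above (not taken from the course materials), the sketch below uses `dask.delayed` to build a small task graph and then runs it on a thread pool. The function `square` and the toy inputs are made up for the example.

```python
from dask import delayed


def square(x):
    """Toy stand-in for an expensive computation (hypothetical example)."""
    return x * x


# Wrapping calls in delayed() only records them in a task graph;
# nothing is computed on these two lines.
lazy_squares = [delayed(square)(i) for i in range(5)]
lazy_total = delayed(sum)(lazy_squares)

# compute() executes the graph; independent tasks may run in parallel.
# scheduler="threads" selects Dask's multi-threaded scheduler.
total = lazy_total.compute(scheduler="threads")
print(total)  # 0 + 1 + 4 + 9 + 16 = 30
```

Passing `scheduler="processes"` instead would run the same graph across multiple processes. Which of the two is faster depends on the workload.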
Syllabus
- Lazy Evaluation and Parallel Computing
- This chapter will teach you the basics of Dask and lazy evaluation. By the end of this chapter, you'll be able to speed up many kinds of Python code using parallel processing or multi-threading. You'll learn the difference between these two task scheduling methods and when to choose each one.
- Parallel Processing of Big, Structured Data
- Here you’ll learn how to analyze big structured data using Dask arrays and Dask DataFrames. You'll learn how everything you know about NumPy and pandas can easily be applied to data that is too large to fit in memory.
- Dask Bags for Unstructured Data
- Process any kind of data. You'll learn how Dask bags can be used to efficiently process unstructured text data, semi-structured JSON data, and even recorded audio.
- Dask Machine Learning and Final Pieces
- Harness the power of Dask to train machine learning models. You'll learn how to train machine learning models on big data using the Dask-ML package, and how to split Dask calculations across a mixture of processes and threads for even greater computing speed.
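To give a flavor of the chunked, NumPy-like workflow the syllabus mentions, here is a minimal sketch using `dask.array`. It is an illustration only, not an excerpt from the course, and the array size and chunk shape are arbitrary choices.

```python
import dask.array as da

# A 2000 x 2000 array of ones, split into 500 x 500 chunks.
# The chunks are created only when a result is requested.
x = da.ones((2000, 2000), chunks=(500, 500))
print(x.numblocks)  # (4, 4): four chunks along each axis

# NumPy-style operations build a lazy task graph over the chunks.
col_means = x.mean(axis=0)

# compute() runs the graph chunk by chunk and returns a NumPy array.
result = col_means.compute()
print(result.shape)  # (2000,)
```

Because each chunk is processed separately, the same pattern can in principle be applied to arrays that are too large to hold in memory all at once.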
Taught by
James Fulton
Related Courses
- Intro to Parallel Programming (Nvidia via Udacity)
- Introduction to Linear Models and Matrix Algebra (Harvard University via edX)
- Introduction to Parallel Programming Using OpenMP and MPI (Tomsk State University via Coursera)
- Supercomputing (Partnership for Advanced Computing in Europe via FutureLearn)
- Fundamentals of Parallelism on Intel Architecture (Intel via Coursera)