YoVDO

Pandas 2, Dask or Polars? Quickly Tackling Larger Data on a Single Machine

Offered By: GAIA via YouTube

Tags

pandas Courses, Data Science Courses, Big Data Courses, Python Courses, Data Processing Courses, GPU Acceleration Courses, Dask Courses, Apache Arrow Courses

Course Description

Overview

This 28-minute conference talk compares Pandas 2, Dask, and Polars for handling large datasets on a single machine. It covers recent changes in each tool: Pandas 2 adds Arrow data types, faster calculations, and improved scalability; Dask scales Pandas across cores and has recently gained an "expressions" optimization; Polars is a newer competitor designed around Arrow with native multicore support. The talk works through a "just about fits in RAM" data task with all three, weighing their pros and cons to support informed choices for research workflows. It also examines whether Pandas operations still require 5x working RAM, how much Pandas string operations have sped up, and how well Polars works with tools such as scikit-learn and Matplotlib. The presenter, Ian Ozsvald, is an experienced Chief Data Scientist and author, and the talk is aimed at data scientists and researchers who want to speed up their data processing.

Syllabus

Pandas 2, Dask or Polars? Quickly Tackling Larger Data on a Single Machine by Ian Ozsvald


Taught by

GAIA

Related Courses

Parallel Programming with Dask in Python
DataCamp
Faster pandas
LinkedIn Learning
Scaling Python Data Applications with Dask
Pluralsight
Trabajando con Dask
Coursera Project Network via Coursera