Pandas 2, Dask or Polars? Quickly Tackling Larger Data on a Single Machine
Offered By: GAIA via YouTube
Course Description
Overview
This 28-minute conference talk compares Pandas 2, Dask, and Polars for handling large datasets on a single machine. It covers recent developments in each tool: Pandas 2's new Arrow data types, faster calculations, and improved scalability; Dask's ability to scale Pandas across cores and its recent "expressions" optimization; and Polars, a newer competitor designed around Arrow with native multicore support. The talk works through a "just about fits in RAM" data task with all three solutions, weighing their pros and cons to inform decisions for research workflows. It also asks whether Pandas operations still require 5x working RAM, how much faster Pandas string operations have become, and how well Polars works with tools such as Scikit-learn and matplotlib. Presented by Ian Ozsvald, an experienced Chief Data Scientist and author, the talk is aimed at data scientists and researchers looking to optimize their data processing.
Syllabus
Pandas 2, Dask or Polars? Quickly Tackling Larger Data on a Single Machine by Ian Ozsvald
Taught by
GAIA
Related Courses
Coding the Matrix: Linear Algebra through Computer Science Applications
Brown University via Coursera
How Machines Think - An Introduction to Computing Techniques
King Fahd University of Petroleum and Minerals via Rwaq (رواق)
Data Science and Situational Analysis: Behind the Scenes of Big Data
IONIS via IONIS
Data Lakes for Big Data
EdCast
Statistics I: Fundamentals of Data Analysis (ga014)
University of Tokyo via gacco