YoVDO

Faster pandas

Offered By: LinkedIn Learning

Tags

pandas Courses, Data Analysis Courses, Memory Management Courses, Dask Courses, Numba Courses, Cython Courses

Course Description

Overview

Learn how to make your pandas code quicker and more efficient. This course covers vectorization, common mistakes, pandas performance, saving memory, Numba, Cython, and more.

Syllabus

Introduction
  • pandas and performance
  • What you should know
  • Working with the files on GitHub
1. Overview
  • Why performance matters
  • Setting goals
  • Measuring performance
  • Profiling
  • Challenge: Identify bottleneck
  • Solution: Identify bottleneck
2. Vectorization
  • What is vectorization?
  • Boolean indexing
  • Understanding ufuncs
  • Challenge: Selecting and manipulating data
  • Solution: Selecting and manipulating data
3. Common Mistakes
  • The limitations of appending
  • The limitations of object dtype
  • The limitations of row iteration
  • Understanding the isin function
  • Parsing time once
  • Challenge: Query a DataFrame
  • Solution: Query a DataFrame
4. pandas Performance
  • Using built-in functions
  • Understanding eval and query
  • Understanding the join function
  • Challenge: Join and query
  • Solution: Join and query
5. Saving Memory
  • Why is memory important?
  • Measuring memory
  • Loading parts of data
  • Categorical data
  • Challenge: Reducing memory
  • Solution: Reducing memory
6. Fast Serialization
  • Various formats and why not CSV
  • Optimizing with SQL
  • Optimizing with HDF5
  • Challenge: Bike ride duration
  • Solution: Bike ride duration
7. Numba and Cython
  • What is Numba?
  • Using Numba
  • What's Cython?
  • Writing Cython code
  • Compiling Cython
  • %%cython magic
  • Challenge: Cython speedup
  • Solution: Cython speedup
8. Alternative DataFrames
  • Overview of alternative DataFrames
  • Using Dask
  • Using Vaex
  • Challenge: Vaex vs. pandas
  • Solution: Vaex vs. pandas
Conclusion
  • Next steps

Taught by

Miki Tebeka

Related Courses

  • Faster Python Code (LinkedIn Learning)
  • Python for Engineers and Scientists (LinkedIn Learning)
  • Acelere su Código Python con Numba y Cupy (The Machine Learning Engineer via YouTube)
  • Accelerating Python Code with Numba and CuPy for Machine Learning (The Machine Learning Engineer via YouTube)
  • Numba - A JIT Compiler for Fast Numerical Code (EuroPython Conference via YouTube)