YoVDO

Apache Arrow and Substrait - The Secret Foundations of Data Engineering

Offered By: EuroPython Conference via YouTube

Tags

EuroPython Courses
pandas Courses
Data Engineering Courses
Apache Arrow Courses

Course Description

Overview

This 44-minute conference talk from EuroPython 2023 examines the role of Apache Arrow and Substrait in data engineering. It explains how PyArrow, the Python library for Apache Arrow, is becoming the de facto standard for data transfer and interoperability across libraries and languages, and how Substrait is gaining adoption as a standard representation for query plans, allowing queries to be routed and decomposed across different engines. The talk covers how Python libraries such as pandas and Polars build on Arrow, and how compute engines such as Velox, DataFusion, and Acero are adopting both Arrow and Substrait. It also shows the construction of a basic database system using Arrow and Substrait with minimal code, illustrating the foundation these technologies provide for modern data engineering.

Syllabus

Apache Arrow and Substrait, the secret foundations of Data Engineering — Alessandro Molina


Taught by

EuroPython Conference

Related Courses

Machine Learning with RAPIDS - Accelerating Data Science Workflows
Nvidia via YouTube
Streaming Featurization with Ibis, Substrait and Apache Arrow
Open Data Science via YouTube
Sound Data Engineering in Rust - From Bits to DataFrames
Databricks via YouTube
DataFusion and Apache Arrow: Supercharging Data Analytics with a Rust-Based Query Engine
Databricks via YouTube
Cloud Fetch: High-Bandwidth Connectivity for BI Tools - Databricks
Databricks via YouTube