DataFusion and Apache Arrow: Supercharging Data Analytics with a Rust-Based Query Engine

Offered By: Databricks via YouTube

Tags

Apache Arrow Courses
SQL Courses
Rust Courses
Data Analytics Courses
Distributed Computing Courses
LLVM Courses

Course Description

Overview

Discover how Rust, Apache Arrow, and the DataFusion query engine are revolutionizing modern data stacks in this 29-minute video from Databricks. Explore the implementation timeline for new database systems, LLVM-like infrastructure for databases, and DataFusion's growth and milestones. Learn about DataFusion's features, including SQL support and extensibility, and examine real-world applications such as VegaFusion and Cube.js. Gain insights into future directions for distributed computing with Ballista, and understand how these technologies can supercharge your data analytics tools.

Syllabus

Introduction
What is going on?
Apache Arrow
Implementation timeline for a new Database system
LLVM Compiler Infrastructure
LLVM-like Infrastructure for Databases
DataFusion Project Growth
DataFusion Milestones: Time to Mature
Initial Logical Plan
Let's Optimize!
The Initial Execution Plan
DataFusion Features
SQL Support
Extensibility
VegaFusion
Cube.js / Cube Store
Coralogix
blaze-rs
Ballista Distributed Compute
Future Directions


Taught by

Databricks

Related Courses

Machine Learning with RAPIDS - Accelerating Data Science Workflows
Nvidia via YouTube
Streaming Featurization with Ibis, Substrait and Apache Arrow
Open Data Science via YouTube
Sound Data Engineering in Rust - From Bits to DataFrames
Databricks via YouTube
Cloud Fetch: High-Bandwidth Connectivity for BI Tools - Databricks
Databricks via YouTube
Data Science Across Data Sources with Apache Arrow - Accelerating Analytics and Interoperability
Databricks via YouTube