
Sound Data Engineering in Rust - From Bits to DataFrames

Offered By: Databricks via YouTube

Tags

Rust Courses, Apache Spark Courses, JDBC Courses, Data Engineering Courses, DataFrames Courses, Parquet Courses, Apache Arrow Courses

Course Description

Overview

Explore sound data engineering principles in Rust, from fundamental bits to advanced DataFrames, in this 35-minute Databricks conference talk. Dive into Spark's Data Source APIs and their optimization techniques for querying external data sources. Learn about filter push down, column pruning, and the newly introduced partial aggregate push down, which significantly improves query performance. Discover how these optimizations are implemented in JDBC and Parquet. Examine the relationship between information storage and usage, CPU utilization, and the role of Apache Arrow in data processing. Gain insights into the Rust programming language and its integration with Arrow. Watch a demo showcasing practical applications, explore who uses arrow2, and understand the benefits of Polars. Analyze benchmarks and key takeaways from the DATA+AI SUMMIT 2022 presentation on efficient data engineering practices.

Syllabus

Intro
Background
Outline
Information is both stored and used
"Read" uses IO and CPU
CPUs sleep and run
Apache Arrow
Rust Programming Language
Arrow with Rust
Demo
Who uses arrow2
Polars
Benchmarks
In summary
DATA+AI SUMMIT 2022


Taught by

Databricks
