Impala Performance on Iceberg Tables - Optimizations and Benchmarks
Offered By: The ASF via YouTube
Course Description
Overview
Explore the performance of Apache Impala on Iceberg tables in this 27-minute conference talk from The ASF. Dive into the implementation details of Impala's optimized C++ approach for reading Iceberg tables, contrasting it with other engines that rely on the Iceberg library. Discover how Impala efficiently handles delete files in Iceberg tables, implementing new Iceberg-specific operators for improved query performance. Gain insights into Impala's architecture, Iceberg's structure, and the performance enhancements specifically designed for Iceberg integration. Compare Impala's performance against other open-source query engines through detailed measurements. By the end, acquire a high-level understanding of both Impala and Iceberg architectures, along with Impala's competitive edge in querying Iceberg tables with position delete files.
Syllabus
Let’s see how fast Impala runs on Iceberg
Taught by
The ASF
Related Courses
Building Modern Data Streaming Apps with Open Source (Linux Foundation via YouTube)
How to Stabilize a GenAI-First Modern Data LakeHouse - Provisioning 20,000 Ephemeral Data Lakes per Year (CNCF [Cloud Native Computing Foundation] via YouTube)
Data Storage and Queries (DeepLearning.AI via Coursera)
Delivering Portability to Open Data Lakes with Delta Lake UniForm (Databricks via YouTube)
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts (Databricks via YouTube)