Apache Impala: Reading, Modifying, and Optimizing Iceberg Tables
Offered By: The ASF via YouTube
Course Description
Overview
Discover how Apache Impala has evolved to meet modern data warehouse requirements in this 26-minute conference talk from The Apache Software Foundation. Learn about Impala's new capabilities for reading, modifying, and optimizing Apache Iceberg tables, including row-level modifications and table maintenance features. Explore how Impala now supports RDBMS-like functionality such as removing and updating individual records, which helps meet GDPR and CCPA requirements. Understand how the OPTIMIZE statement keeps tables healthy by merging small data files and eliminating delete files. Gain insights into the DROP PARTITION statement, which removes selected partitions based on predicates. Presented by Cloudera engineers Zoltán Borók-Nagy, Péter Rózsa, and Noémi Pap-Takács, this talk demonstrates how Impala has adapted to emerging requirements while maintaining its focus on performance in distributed, massively parallel query execution for big data.
Syllabus
This Impala not only reads, but modifies and optimizes Iceberg tables
Taught by
The ASF
Related Courses
Building Modern Data Streaming Apps with Open Source (Linux Foundation via YouTube)
How to Stabilize a GenAI-First Modern Data LakeHouse - Provisioning 20,000 Ephemeral Data Lakes per Year (CNCF [Cloud Native Computing Foundation] via YouTube)
Data Storage and Queries (DeepLearning.AI via Coursera)
Delivering Portability to Open Data Lakes with Delta Lake UniForm (Databricks via YouTube)
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts (Databricks via YouTube)