YoVDO

Apache XTable - Interoperability Among Lakehouse Table Formats

Offered By: Databricks via YouTube

Tags

Cloud Storage Courses
Delta Lake Courses
Parquet Courses
Apache Hudi Courses

Course Description

Overview

This 36-minute conference talk, presented by Dipankar Mazumdar and Kyle Weller of Onehouse, examines lakehouse table formats. It covers the challenge of choosing among the leading open source projects, Apache Hudi, Delta Lake, and Apache Iceberg, each of which adds transaction and metadata layer primitives on top of decoupled storage and offers its own distinct features. The talk introduces Apache XTable, an open-source project that provides omnidirectional interoperability between table formats without introducing a new format. XTable's metadata translation abstractions let you write data in any one format and convert it to target formats that different compute engines can consume, which eases the problem of format selection and interoperability in lakehouse workloads. The talk also explains how lakehouse data is stored in open columnar file formats such as Parquet, alongside metadata for schema, commit history, partitions, and column statistics. After the talk, explore additional resources on data lakehouse concepts and fundamentals to deepen your understanding of the field.

Syllabus

Apache XTable (incubating): Interoperability Among Lakehouse Table Formats


Taught by

Databricks

Related Courses

History and Evolution of Data Lake Architecture - Post Lambda Architecture
Linux Foundation via YouTube
Transform Your Machine Learning Pipelines with Apache Hudi
Linux Foundation via YouTube
Delivering Portability to Open Data Lakes with Delta Lake UniForm
Databricks via YouTube
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts
Databricks via YouTube
How to Migrate from Snowflake to an Open Data Lakehouse Using Delta Lake UniForm
Databricks via YouTube