Apache XTable - Interoperability Among Lakehouse Table Formats
Offered By: Databricks via YouTube
Course Description
Overview
Explore lakehouse table formats in this 36-minute conference talk presented by Dipankar Mazumdar and Kyle Weller from Onehouse. Examine the challenge of choosing among the leading open source projects, Apache Hudi, Delta Lake, and Apache Iceberg, each of which adds transaction and metadata layer primitives on top of decoupled storage. All three store data in open columnar formats like Parquet, along with metadata for schema, commit history, partitions, and column stats. Learn about XTable, an open-source project that provides omnidirectional interoperability between these table formats without introducing a new format. Discover how XTable's metadata translation abstractions let you write data in any one format and convert it to target formats that different compute engines can consume, easing the problem of format selection and interoperability in lakehouse workloads. After the talk, explore additional resources on data lakehouse concepts and fundamentals to deepen your understanding of this evolving field.
Syllabus
Apache XTable (incubating): Interoperability Among Lakehouse Table Formats
Taught by
Databricks
Related Courses
History and Evolution of Data Lake Architecture - Post Lambda Architecture
Linux Foundation via YouTube
Transform Your Machine Learning Pipelines with Apache Hudi
Linux Foundation via YouTube
Delivering Portability to Open Data Lakes with Delta Lake UniForm
Databricks via YouTube
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts
Databricks via YouTube
How to Migrate from Snowflake to an Open Data Lakehouse Using Delta Lake UniForm
Databricks via YouTube