Multi-Table Transactions with LakeFS and Delta Lake - Tech Talk
Offered By: Databricks via YouTube
Course Description
Overview
Explore multi-table transactions with LakeFS and Delta Lake in this 45-minute tech talk recording from Databricks. Learn how LakeFS enables collaborative data lake management and CI/CD deployment of data, while Delta Lake facilitates building a Lakehouse architecture on various storage systems. Discover the integration of these technologies to simplify multi-table pipelines. Gain insights from speakers Paul Singman of Treeverse and Denny Lee of Databricks as they discuss typical data lake projects, challenges, and solutions. Follow along with demonstrations of the LakeFS UI, repository setup, DataFrame operations, and merging techniques. Understand concepts like single table consistency, validation notebooks, Git integration, and concurrent operations. Enhance your knowledge of modern data lake management and Lakehouse architectures through this informative presentation.
Syllabus
Intro
Presentation
Typical Data Lake Project
Data Lake Problems
Single Table Consistency
Multi-Table Transactions
Demo
LakeFS UI
Demo Overview
Demo Repository
DataFrame
Second Table
Commit
Merge
Merge Failed
Retry Merge
Validation Notebook
Questions
Git Integration
Concurrent Operations
Summary
Taught by
Databricks
Related Courses
CI/CD for Data - Building Dev/Test Data Environments with Open Source Stacks
CNCF [Cloud Native Computing Foundation] via YouTube
Building Reproducible ML Processes with an Open Source Stack
Linux Foundation via YouTube
Power Up Your Lakehouse with Git Semantics and Delta Lake
Databricks via YouTube
Version Control for Lakehouse Architecture - Essential Practices and Benefits
Databricks via YouTube
Developing Data Pipelines with Branch Deployments - A New Approach
Databricks via YouTube