YoVDO

Delta Lake 2.0 Overview - New Features and Community Collaborations

Offered By: Databricks via YouTube

Tags

Delta Lake Courses
Data Warehousing Courses
Apache Spark Courses
Apache Flink Courses
Presto Courses
Apache Pulsar Courses
Trino Courses

Course Description

Overview

Explore the latest features and integrations of Delta Lake 2.0 in this 38-minute video presentation by Databricks. Dive into the collaborative efforts of the Delta community that led to this significant release, including integrations with Apache Spark™, Apache Flink, Apache Pulsar, Presto, and Trino. Learn about advanced features such as OPTIMIZE ZORDER, data skipping using column stats, S3 multi-cluster writes, and Change Data Feed. Discover the expanded language support with APIs for Rust, Python, Ruby, GoLang, Scala, and Java. Gain insights into the three phases of Delta Lake's development, understand the motivations behind new features like Change Data Feed and Column Mapping, and explore solutions to challenges in multi-cluster writes on S3. Examine the Delta Source for Flink, Delta connector for Trino/Presto, and the introduction of Delta Standalone. Get an overview of multiple Delta projects and repositories in this comprehensive update on Delta Lake 2.0.

Syllabus

Intro
Three phases of Delta Lake (abridged)
What is in Delta 2.0.0?
Data skipping via column stats
Change Data Feed: Motivation
Change Data Feed: Problem
Change Data Feed: Solution
Column Mapping: Problem
Column Mapping: Solution
Multi-cluster writes on S3: Problem
Multi-cluster writes on S3: Solution
Flink: Delta Source
Trino / Presto: Delta connector
Delta Standalone
Multiple Delta projects and repositories


Taught by

Databricks

Related Courses

Why Lakehouse Architecture Now - Exploring Enterprise Data Warehouse Failures and the Need for Lakehouse Paradigm
Databricks via YouTube
Scaling Climate Data for FinTech with an Open Source Data Mesh
Linux Foundation via YouTube
Getting Started with Delta Lake
Linux Foundation via YouTube
ETL - Extract Trino Load: A Case for Trino as a Batch Processing Engine
Linux Foundation via YouTube
Consuming Legend Data Models and Services Using BI, Python/ML and Other Tools
Linux Foundation via YouTube