Patterns and Operational Insights for Large-Scale Delta Lake Workloads
Offered By: Databricks via YouTube
Course Description
Overview
Explore effective patterns and operational insights from early adopters of Delta Lake in this 42-minute conference talk. Discover how to handle demanding workloads over large volumes of log and telemetry data for cyber threat detection and response. Learn about streaming ETL, data enrichment, analytic workloads, and large materialized aggregates for fast answers. Dive into Z-ordering optimization techniques, including schema design considerations and the 32-column default limit. Understand the implications of date partitioning with long-tail distributions and unsynchronized clocks. Gain insights into optimization strategies, including when to use auto-optimize. Explore upsert patterns that simplify important jobs, and learn how to tune Delta Lake for very large tables and low-latency access. Benefit from real-world experience operating large-scale workloads on Databricks and Delta Lake, covering topics such as the Parse Framework, merge operations, stateful processing, scaling, schema ordering, partitioning, and handling conflicting transactions.
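The Z-ordering mentioned above rests on a space-filling (Morton) curve: the bits of several column values are interleaved into one sort key, so rows that are close in multi-column space end up close together on disk and range filters on any of the columns can skip most files. A minimal sketch of the underlying idea in plain Python, assuming two non-negative integer columns; `z_order_key` is an illustrative name, not Delta Lake's implementation or API:

```python
def z_order_key(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton sort key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # bits of x land on even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # bits of y land on odd positions
    return key

# Sorting rows by the interleaved key clusters them along the Z-order curve,
# which is what lets data skipping prune files on either column.
rows = [(3, 1), (0, 0), (1, 3), (2, 2)]
rows.sort(key=lambda r: z_order_key(*r))
```

In Delta Lake itself this clustering is requested declaratively (e.g. `OPTIMIZE ... ZORDER BY (col1, col2)`) rather than computed by hand; the sketch only shows why ordering by many columns at once is possible, and why adding columns dilutes the locality each one gets.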
Syllabus
Introduction
Parse Framework
Merge
Stateful Processing
Merged Tables
Scaling
Schema Ordering
Partitioning
Conflicting Transactions
Metadata
Taught by
Databricks
Related Courses
In-Memory Data Management (openHPI)
CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Processing Big Data with Azure Data Lake Analytics (Microsoft via edX)
Google Cloud Big Data and Machine Learning Fundamentals en Español (Google Cloud via Coursera)
Google Cloud Big Data and Machine Learning Fundamentals 日本語版 (Google Cloud via Coursera)