Delta Live Tables: Modern Software Engineering for ETL Pipelines

Offered By: Databricks via YouTube

Tags

ETL Courses, Software Engineering Courses, Data Transformation Courses, Stream Processing Courses, Batch Processing Courses, Delta Live Tables Courses

Course Description

Overview

A talk on Delta Live Tables (DLT), an ETL framework that simplifies data transformation and pipeline management. Learn how DLT applies modern software engineering practices to deliver reliable data pipelines at scale. The talk covers faster pipeline development and maintenance, automation of administrative tasks, and improved visibility into pipeline operations, along with built-in quality controls and monitoring that support accurate BI, data science, and ML. It also shows how to run batch and streaming workloads on self-optimizing, auto-scaling pipelines. Topics include live table dependencies, pipeline boundaries, SQL expectations, Python integration, metaprogramming, event log integration, automated failure recovery, Spark Structured Streaming for ingestion, and Delta for infinite retention.

Syllabus

Intro
Life as a data professional
What is a Live Table?
Development vs Production
Declare LIVE Dependencies
Choosing pipeline boundaries
Pitfall: hard-code sources & destinations
Ensure correctness with Expectations
Expectations using the power of SQL
Using Python
Installing libraries with pip
Metaprogramming in Python
Best Practice: Integrate using the event log
DLT Automates Failure Recovery
What is Spark™ Structured Streaming?
Using Spark Structured Streaming for ingestion
Use Delta for infinite retention
Partition recomputation


Taught by

Databricks

Related Courses

Building Batch Data Pipelines on GCP auf Deutsch
Google Cloud via Coursera
Building Batch Data Pipelines on GCP en Français
Google Cloud via Coursera
Mastering Azure Data Factory: From Basics to Advanced Level
Udemy
Data Science de A a Z - Extraçao e Exibição dos Dados
Udemy
Building Batch Data Processing Solutions in Microsoft Azure
Pluralsight