YoVDO

DBT Using Databricks and Delta

Offered By: Databricks via YouTube

Tags

dbt (Data build tool) Courses
SQL Courses
Databricks Courses
Data Lakes Courses
Data Processing Courses
Observability Courses
Azure Cloud Courses

Course Description

Overview

Explore a comprehensive 26-minute talk on integrating Data Build Tool (DBT) with Databricks and Delta for efficient data lake management. Learn how this open-source, SQL-first technology enhances data quality and documentation throughout the data lake lifecycle. Discover the basics of DBT and its synergy with Databricks for powerful data processing. Examine how DBT supports Delta to enable SQL-based upserts. Investigate the integration of DBT and Databricks within the Azure cloud environment. Gain insights into emitting pipeline metrics to Azure Monitor for improved observability. Dive into topics such as DBT as a SQL runner and compiler, documentation generation, testing, incremental ingestion, DBT macros, and the use of Hive UDFs. Master the art of maintaining high-quality data pipelines using software engineering best practices.

Syllabus

Intro
GoDataDriven
Data Build Tool
SQL with some Jinja2 sauce
DBT as a SQL Runner
DBT as a SQL Compiler
Next to the SQL there is documentation
dbt docs generate and dbt docs serve
Testing
How does DBT communicate with Spark?
Switch to incremental ingestion
Switch to incremental Delta
In practice
DBT Macros
Observability is king
Very simple Hive UDF
Small snippet of Scala
Use the UDF in DBT
Be proactive
Feedback


Taught by

Databricks

Related Courses

Data Modeling, Transformation, and Serving (DeepLearning.AI via Coursera)
Introduction to dbt (DataCamp)
Advance Your Data Engineering Skills (LinkedIn Learning)
Data Engineering: dbt for SQL (LinkedIn Learning)
Data Engineering Hands-On Practice (LinkedIn Learning)