YoVDO

Building Robust Data Pipelines for Modern Data Engineering - End-to-End Project

Offered By: CodeWithYu via YouTube

Tags

Data Engineering Courses
Apache Spark Courses
Azure Storage Courses
Azure Data Factory Courses
Azure Key Vault Courses
Data Pipelines Courses
Azure Databricks Courses
Medallion Architecture Courses

Course Description

Overview

Work through an end-to-end data engineering project in this nearly two-hour video tutorial. Learn to build data pipelines with Apache Spark, Azure Databricks, and Data Build Tool (DBT), using Azure as the cloud provider. The instructor walks through ingesting data into a lakehouse, integrating it with Azure Data Factory, and transforming it with Databricks and DBT. Along the way you set up resource groups, implement a medallion architecture, configure Azure Key Vault for secure secret management, and orchestrate data pipelines. The tutorial also covers integrating Azure Databricks with Key Vault and Data Factory, then moves on to DBT setup, configuration, snapshots, and data marts. By the end, you should understand the core practices of modern data engineering well enough to build scalable data pipelines in the cloud.
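
As a rough illustration of the medallion layout mentioned above, the sketch below builds ADLS Gen2 `abfss://` URIs for bronze, silver, and gold layers. The one-container-per-layer layout, the storage account name, and the dataset names are assumptions made for illustration; the video may organize its storage account differently.

```python
# Hypothetical sketch: one ADLS Gen2 container per medallion layer.
# Account and dataset names are placeholders, not taken from the course.

MEDALLION_LAYERS = ("bronze", "silver", "gold")


def layer_path(account: str, layer: str, dataset: str) -> str:
    """Return the abfss:// URI for a dataset in the given medallion layer."""
    if layer not in MEDALLION_LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    # ADLS Gen2 URI format:
    # abfss://<container>@<account>.dfs.core.windows.net/<path>
    return f"abfss://{layer}@{account}.dfs.core.windows.net/{dataset.strip('/')}"


def all_layer_paths(account: str, dataset: str) -> dict[str, str]:
    """Map each layer name to the dataset's URI in that layer."""
    return {layer: layer_path(account, layer, dataset) for layer in MEDALLION_LAYERS}
```

In a Databricks notebook, a URI like this would typically be passed to a Spark read or write call once the cluster can authenticate to the storage account, for example with credentials stored in Key Vault.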

Syllabus

Introduction
System Architecture
Creating resource groups on Azure
Setting up the medallion architecture storage account
Setting up Azure Data Factory
Azure Key Vault setup for secrets
Azure database with automatic data population
Azure Data Factory pipeline orchestration
Setting up Databricks
Azure Databricks Secret Scope and Key Vault
Verifying Databricks - Key Vault - Secret Scope Integration
Azure Data Factory - Databricks Integration
DBT Setup
DBT Configuration with Azure Databricks
DBT Snapshots with Azure Databricks and ADLS Gen2
DBT Data Marts with Azure Databricks and ADLS Gen2
DBT Documentation
Outro


Taught by

CodeWithYu

Related Courses

In-Memory Database Management (内存数据库管理)
openHPI
CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Processing Big Data with Azure Data Lake Analytics
Microsoft via edX
Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Google Cloud Big Data and Machine Learning Fundamentals 日本語版
Google Cloud via Coursera