Building Robust Data Pipelines for Modern Data Engineering - End-to-End Project
Offered By: CodeWithYu via YouTube
Course Description
Overview
Work through an end-to-end data engineering project in this nearly two-hour video tutorial. Learn to build data pipelines using Apache Spark, Azure Databricks, and Data Build Tool (DBT), with Azure as the cloud provider. The instructor walks through data ingestion into a lakehouse, data integration with Azure Data Factory, and data transformation using Databricks and DBT. Gain hands-on experience creating resource groups, setting up a storage account for the medallion architecture, configuring Azure Key Vault for secure secret management, and orchestrating data pipelines. The tutorial also covers integrating Azure Databricks with Key Vault and Data Factory, followed by DBT setup, configuration, snapshots, data marts, and documentation. By the end, you'll understand how these components fit together and be equipped to build scalable data pipelines in the cloud.
Syllabus
Introduction
System Architecture
Creating resource groups on Azure
Setting up the medallion architecture storage account
Setting up Azure Data Factory
Azure Key Vault setup for secrets
Azure database with automatic data population
Azure Data Factory pipeline orchestration
Setting up Databricks
Azure Databricks Secret Scope and Key Vault
Verifying Databricks - Key Vault - Secret Scope Integration
Azure Data Factory - Databricks Integration
DBT Setup
DBT Configuration with Azure Databricks
DBT Snapshots with Azure Databricks and ADLS Gen2
DBT Data Marts with Azure Databricks and ADLS Gen2
DBT Documentation
Outro
Taught by
CodeWithYu
Related Courses
Hands-On with Dataflow (A Cloud Guru)
Azure Data Engineer con Databricks y Azure Data Factory (Coursera Project Network via Coursera)
Data Integration with Microsoft Azure Data Factory (Microsoft via Coursera)
Azure Data Factory : Implement SCD Type 1 (Coursera Project Network via Coursera)
MLOps1 (Azure): Deploying AI & ML Models in Production using Microsoft Azure Machine Learning (statistics.com via edX)