Implement a data engineering solution with Azure Databricks

Offered By: Microsoft via Microsoft Learn

Tags

Git Courses, Apache Spark Courses, CI/CD Courses, Data Engineering Courses, Delta Lake Courses, Azure Databricks Courses, Event-Driven Architecture Courses, Spark Structured Streaming Courses, Delta Live Tables Courses

Course Description

Overview

  • Module 1: Learn about Spark Structured Streaming, ways to optimize it, and how to use it to populate destination objects

    At the end of this module, you're able to:

    • Understand Spark Structured Streaming.
    • Apply techniques to optimize Structured Streaming.
    • Handle late-arriving or out-of-order events.
    • Set up real-time sources for incremental processing.
  • Module 2: Learn about Structured Streaming with Delta Live Tables

    At the end of this module, you're able to:

    • Use event-driven architectures with Delta Live Tables
    • Ingest streaming data
    • Achieve data consistency and reliability
    • Scale streaming workloads with Delta Live Tables
  • Module 3: Optimize performance with Spark and Delta Live Tables in Azure Databricks.

    In this module, you learn how to:

    • Use serverless compute and parallelism with Delta Live Tables
    • Perform cost-based optimization and query tuning
    • Use Change Data Capture (CDC)
    • Apply enhanced autoscaling capabilities
    • Implement observability and data quality metrics
  • Module 4: Implement CI/CD workflows in Azure Databricks

    In this module, you learn how to:

    • Implement version control and Git integration.
    • Perform unit testing and integration testing.
    • Maintain environment and configuration management.
    • Implement rollback and roll-forward strategies.

Syllabus

  • Module 1: Perform incremental processing with Spark Structured Streaming
    • Introduction
    • Set up real-time data sources for incremental processing
    • Optimize Delta Lake for incremental processing in Azure Databricks
    • Handle late data and out-of-order events in incremental processing
    • Monitoring and performance tuning strategies for incremental processing in Azure Databricks
    • Exercise - Real-time ingestion and processing with Delta Live Tables with Azure Databricks
    • Knowledge check
    • Summary
  • Module 2: Implement streaming architecture patterns with Delta Live Tables
    • Introduction
    • Event-driven architectures with Delta Live Tables
    • Ingest data with Structured Streaming
    • Maintain data consistency and reliability with Structured Streaming
    • Scale streaming workloads with Delta Live Tables
    • Exercise - End-to-end streaming pipeline with Delta Live Tables
    • Knowledge check
    • Summary
  • Module 3: Optimize performance with Spark and Delta Live Tables
    • Introduction
    • Optimize performance with Spark and Delta Live Tables
    • Perform cost-based optimization and query tuning
    • Use change data capture (CDC)
    • Use enhanced autoscaling
    • Implement observability and data quality metrics
    • Exercise - Optimize data pipelines for better performance in Azure Databricks
    • Knowledge check
    • Summary
  • Module 4: Implement CI/CD workflows in Azure Databricks
    • Introduction
    • Implement version control and Git integration
    • Perform unit testing and integration testing
    • Manage and configure your environment
    • Implement rollback and roll-forward strategies
    • Exercise - Implement CI/CD workflows
    • Knowledge check
    • Summary

Related Courses

  • Azure Data Engineer con Databricks y Azure Data Factory (Coursera Project Network via Coursera)
  • Operationalizing Microsoft Azure AI Solutions (Pluralsight)
  • Building Your First ETL Pipeline Using Azure Databricks (Pluralsight)
  • Implementing an Azure Databricks Environment in Microsoft Azure (Pluralsight)
  • Building Batch Data Processing Solutions in Microsoft Azure (Pluralsight)