Implement a data engineering solution with Azure Databricks
Offered By: Microsoft via Microsoft Learn
Course Description
Overview
- Module 1: Learn about Spark Structured Streaming, ways to optimize it, and how to use it to populate destination objects. At the end of this module, you're able to:
  - Understand Spark Structured Streaming.
  - Apply techniques to optimize Structured Streaming.
  - Handle late-arriving or out-of-order events.
  - Set up real-time sources for incremental processing.
 
- Module 2: Learn about Structured Streaming with Delta Live Tables. At the end of this module, you're able to:
  - Use event-driven architectures with Delta Live Tables.
  - Ingest streaming data.
  - Achieve data consistency and reliability.
  - Scale streaming workloads with Delta Live Tables.
 
- Module 3: Optimize performance with Spark and Delta Live Tables in Azure Databricks. In this module, you learn how to:
  - Use serverless compute and parallelism with Delta Live Tables.
  - Perform cost-based optimization and query tuning.
  - Use change data capture (CDC).
  - Apply enhanced autoscaling capabilities.
  - Implement observability and data quality metrics.
 
- Module 4: Implement CI/CD workflows in Azure Databricks. In this module, you learn how to:
  - Implement version control and Git integration.
  - Perform unit testing and integration testing.
  - Maintain environment and configuration management.
  - Implement rollback and roll-forward strategies.
 
Syllabus
- Module 1: Perform incremental processing with Spark Structured Streaming
  - Introduction
  - Set up real-time data sources for incremental processing
  - Optimize Delta Lake for incremental processing in Azure Databricks
  - Handle late data and out-of-order events in incremental processing
  - Monitoring and performance tuning strategies for incremental processing in Azure Databricks
  - Exercise - Real-time ingestion and processing with Delta Live Tables with Azure Databricks
  - Knowledge check
  - Summary
 
- Module 2: Implement streaming architecture patterns with Delta Live Tables
  - Introduction
  - Event-driven architectures with Delta Live Tables
  - Ingest data with Structured Streaming
  - Maintain data consistency and reliability with Structured Streaming
  - Scale streaming workloads with Delta Live Tables
  - Exercise - End-to-end streaming pipeline with Delta Live Tables
  - Knowledge check
  - Summary
 
- Module 3: Optimize performance with Spark and Delta Live Tables
  - Introduction
  - Optimize performance with Spark and Delta Live Tables
  - Perform cost-based optimization and query tuning
  - Use change data capture (CDC)
  - Use enhanced autoscaling
  - Implement observability and data quality metrics
  - Exercise - Optimize data pipelines for better performance in Azure Databricks
  - Knowledge check
  - Summary
 
- Module 4: Implement CI/CD workflows in Azure Databricks
  - Introduction
  - Implement version control and Git integration
  - Perform unit testing and integration testing
  - Manage and configure your environment
  - Implement rollback and roll-forward strategies
  - Exercise - Implement CI/CD workflows
  - Knowledge check
  - Summary
 
