Data Lakes and Lakehouses with Spark and Azure Databricks
Offered By: Udacity
Course Description
Overview
Learn about the big data ecosystem and how to use Spark to work with massive datasets. Learners will also store big data in a data lake and develop a Lakehouse architecture on the Azure Databricks platform.
Syllabus
- Course Introduction
  - In this lesson, you'll learn about the course, including the prerequisites, tools, environment, and course project.
- Big Data Ecosystem, Data Lakes, and Spark
  - In this lesson, you'll learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.
- Data Wrangling with Spark
  - In this lesson, you'll learn how to use Spark for cleaning and aggregating data.
- Spark Debugging and Optimization
  - In this lesson, you'll learn best practices for debugging and optimizing your Spark applications.
- Azure Databricks
  - In this lesson, you'll create Spark clusters and write Spark code on the Azure Databricks platform.
- Data Lakes and Lakehouse with Azure Databricks
  - In this lesson, you'll create data lakes and a Lakehouse architecture on the Azure Databricks platform.
- Building an Azure Data Lake for Bike Share Data Analytics
  - In this project, you'll implement a Lakehouse architecture on the Azure Databricks platform.
Taught by
Matt Swaffer
Related Courses
- Building Cloud Apps with Microsoft Azure - Part 1 (self-paced), Microsoft via edX
- Building Cloud Apps with Microsoft Azure - Part 3, Microsoft via edX
- DEV202.2x: Building Cloud Apps with Microsoft Azure – Part 2, Microsoft via edX
- Architecting Microsoft Azure Solutions, Microsoft via edX
- Implementing Predictive Analytics with Spark in Azure HDInsight, Microsoft via edX