Data lakes and Lakehouses with Spark and Azure Databricks
Offered By: Udacity
Course Description
Overview
Learn about the big data ecosystem and how to use Spark to work with massive datasets. Learners will also store big data in a data lake and develop a Lakehouse architecture on the Azure Databricks platform.
Syllabus
- Course Introduction
  - In this lesson, you'll learn about the course, including the prerequisites, tools, environment, and course project.
- Big Data Ecosystem, Data Lakes, and Spark
  - In this lesson, you'll learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.
- Data Wrangling with Spark
  - In this lesson, you'll dive into how to use Spark for cleaning and aggregating data.
- Spark Debugging and Optimization
  - In this lesson, you'll learn best practices for debugging and optimizing your Spark applications.
- Azure Databricks
  - In this lesson, you'll create Spark clusters and write Spark code on the Azure Databricks platform.
- Data Lakes and Lakehouse with Azure Databricks
  - In this lesson, you'll create data lakes and a Lakehouse architecture on the Azure Databricks platform.
- Building an Azure Data Lake for Bike Share Data Analytics
  - In this project, you'll implement a Lakehouse architecture on the Azure Databricks platform.
Taught by
Matt Swaffer
Related Courses
- Data Processing with Azure, LearnQuest via Coursera
- Mejores prácticas para el procesamiento de datos en Big Data, Coursera Project Network via Coursera
- Data Science with Databricks for Data Analysts, Databricks via Coursera
- Azure Data Engineer con Databricks y Azure Data Factory, Coursera Project Network via Coursera
- Curso Completo de Spark con Databricks (Big Data), Coursera Project Network via Coursera