Data lakes and Lakehouses with Spark and Azure Databricks

Offered By: Udacity

Tags

Microsoft Azure Courses, Cloud Computing Courses, Databricks Courses, Data Lakes Courses, Data Wrangling Courses, Data Engineering Courses, Azure Databricks Courses

Course Description

Overview

Learn about the big data ecosystem and how to use Spark to work with massive datasets. Learners will also store big data in a data lake and develop Lakehouse architecture on the Azure Databricks platform.

Syllabus

Course Introduction

In this lesson, you'll learn about the course, including the prerequisites, tools, environment, and course project.

Big Data Ecosystem, Data Lakes, and Spark

In this lesson, you will learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.

Data Wrangling with Spark

In this lesson, we'll dive into how to use Spark for cleaning and aggregating data.

Spark Debugging and Optimization

In this lesson, you will learn best practices for debugging and optimizing your Spark applications.

Azure Databricks

In this lesson, you'll create Spark clusters and write Spark code on the Azure Databricks platform.

Data Lakes and Lakehouse with Azure Databricks

In this lesson, you'll create data lakes and Lakehouse architecture on the Azure Databricks platform.

Building an Azure Data Lake for Bike Share Data Analytics

In this project, you'll implement Lakehouse architecture on the Azure Databricks platform.

Taught by

Matt Swaffer

Related Courses

Software as a Service
University of California, Berkeley via Coursera

Software Defined Networking
Georgia Institute of Technology via Coursera

Pattern-Oriented Software Architectures: Programming Mobile Services for Android Handheld Systems
Vanderbilt University via Coursera

Web-Technologien
openHPI

Données et services numériques, dans le nuage et ailleurs
Certificat informatique et internet via France Université Numerique