
Building the Petcare Data Platform with Delta Lake and Spark ETL Pipeline

Offered By: Databricks via YouTube

Tags

Data Engineering Courses
Cloud Computing Courses
Microsoft Azure Courses
Apache Spark Courses
Databricks Courses
Data Lakes Courses
Data Analytics Courses
Data Pipelines Courses
Delta Lake Courses
ETL Courses

Course Description

Overview

This 27-minute conference talk covers how Mars Petcare built its cloud-based data lake, the Petcare Data Platform. The Kinship Data & Analytics division used Microsoft Azure, Delta Lake, and Databricks to create 'Kyte', a custom Spark ETL pipeline tool. The talk explains why the team migrated from Azure Data Factory to a Spark-heavy ETL design on a Delta Lake-driven platform, how Delta Lake is used for ETL configurations, and how a bespoke UI monitors and schedules the Spark pipelines. It also shows how Delta Lake exposes data to data scientists, in support of Mars Petcare's mission of making a better world for pets.

Syllabus

Introduction
Our Data Platform
Our ETL Framework
Schema Evolution
Git Integration
Benefits of Delta Lake
Deployment


Taught by

Databricks

Related Courses

Data Processing with Azure
LearnQuest via Coursera
Mejores prácticas para el procesamiento de datos en Big Data
Coursera Project Network via Coursera
Data Science with Databricks for Data Analysts
Databricks via Coursera
Azure Data Engineer con Databricks y Azure Data Factory
Coursera Project Network via Coursera
Curso Completo de Spark con Databricks (Big Data)
Coursera Project Network via Coursera