YoVDO

Building Data Intensive Analytic Applications on Top of Delta Lakes

Offered By: Databricks via YouTube

Tags

Data Analysis Courses, Apache Spark Courses, Data Lakes Courses, Data Engineering Courses, Delta Lake Courses

Course Description

Overview

Explore data reliability and performance in big data workloads in this 43-minute tutorial on building data-intensive analytic applications with Delta Lake. Learn how Delta Lake, an open-source storage layer, brings ACID transactions to Apache Spark™ and addresses key challenges faced by data engineers. Through presentations, code examples, and interactive notebooks, examine the requirements of modern data engineering, the main data reliability challenges, how Delta Lake fits within an Apache Spark™ environment, and practical ways to improve data reliability at scale. Topics include data lakes, streaming, schema evolution, and merge operations, with hands-on examples of Delta Lake's features.

Syllabus

Introduction
Data Lakes
Typical Data Lake Project
Who uses Delta
Getting started
Data
Download Data
Parquet Table
Stop Streaming
Initializing Streaming
Working with Parquet
Using Delta Lake
Streaming Job
Multiple Streaming Queries
Counting Continuously
Schema Evolution
Merged Schema
Summary
History
Vacuum
Mods
Merge
Update Data
Define DataFrame
Merge Syntax
Random Data
For Each Batch
Summarize
Community
Question
Thank you


Taught by

Databricks

Related Courses

In-Memory Database Management
openHPI
CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Processing Big Data with Azure Data Lake Analytics
Microsoft via edX
Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Google Cloud Big Data and Machine Learning Fundamentals 日本語版
Google Cloud via Coursera