How Apache Spark 3.0 and Delta Lake Enhance Data Lake Reliability
Offered By: Databricks via YouTube
Course Description
Overview
Syllabus
Introduction
Who is Danny
Free Download
Databricks
Download the book
Adaptive Query Execution
Apache Spark 3.0
Performance
Spark Catalyst Optimizer
Logical and Physical Planning
AQE Fundamentals
Broadcast Hash Joins
Why not always broadcast join
Dynamically switch join strategies
Flipping the switch
Off script partitioning
Coalescence
Table Size
Coalescing
Traditional Data Warehousing Problem
Split Partitioning
Q&A
Dynamic Partition Pruning
Dynamic Partition Pruning Before Optimization
Filter Scan
Results
Pseudo Rush
Building Ecosystem
Data Lake Reliability
Catalog API
SQL Statement Support
Partial Writes
Delete
Delete from Events
History Retention
Data Source v2 Catalog API
Data Quality Framework
Improved Performance
More About Delta
Taught by
Databricks