Why Lakehouse Architecture Now - Exploring Enterprise Data Warehouse Failures and the Need for Lakehouse Paradigm
Offered By: Databricks via YouTube
Course Description
Overview
Explore the failures of enterprise data warehouse paradigms and discover the need for lakehouse architecture in this 26-minute talk by Databricks. Delve into Starburst's approach to powering SQL-based interactive analytics on Delta Lake, the foundation for lakehouse architecture. Learn about the problems with today's data management approaches, including excessive copying and moving of data. Understand the benefits of the new lakehouse architecture, including optionality and alignment with Gartner's Data Sharing Model for accelerating digital business. Gain insights into the underlying technology, including the reasons for choosing a data lake and then a data lakehouse, the role of Trino, and Starburst's Native Delta Lake Reader. Examine data flow diagrams, ingestion and transformation processes, and explore capabilities beyond basic SELECT operations.
Syllabus
Intro
About Starburst
Today's approach: Too much copying & moving
The problem
New Lakehouse Architecture
Lakehouse architecture provides Optionality
The Gartner Data Sharing Model: To accelerate digital business
Under the Hood
First: Why Data Lake?
Second: Why Data Lakehouse?
What is Trino?
Starburst's Native Delta Lake Reader
Delta Lake Reader Performance
Data Flow Diagram
Data Ingestion and Transformation
Beyond SELECT
Taught by
Databricks
Related Courses
Distributed Computing with Spark SQL
University of California, Davis via Coursera
Apache Spark (TM) SQL for Data Analysts
Databricks via Coursera
Building Your First ETL Pipeline Using Azure Databricks
Pluralsight
Implement a data lakehouse analytics solution with Azure Databricks
Microsoft via Microsoft Learn
Perform data science with Azure Databricks
Microsoft via Microsoft Learn