YoVDO

The Data Lakehouse for Streaming Data - A Talk for Everyone Who Loves Data

Offered By: Devoxx via YouTube

Tags

Devoxx Courses, Sentiment Analysis Courses, Data Lakes Courses, Data Engineering Courses, Streaming Data Courses, Delta Lake Courses

Course Description

Overview

Explore the technical aspects of Data Lakehouses and their impact on streaming data in this comprehensive talk. Delve into the fusion of data lakes and data warehouses, examining how open-source projects like Delta Lake enhance data management with ACID transactions, schema enforcement, and efficient metadata handling. Discover the capabilities of open-source solutions for streaming data and gain insights into future improvements. Investigate streaming data analysis, machine learning on the lakehouse, and Project Lightspeed's potential for low-latency Apache Spark Structured Streaming. Witness a live demonstration of Twitter stream ingestion using a declarative, auto-scaling data pipeline for sentiment analysis with Hugging Face. Ideal for data architects, engineers, and practitioners interested in open-source and cloud services, this presentation offers a deep dive into the Databricks Lakehouse and its practical applications.
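As a rough illustration of the schema enforcement and all-or-nothing commits mentioned in the overview, here is a minimal pure-Python sketch. It is not the Delta Lake API: the class ToyLakehouseTable, the exception SchemaMismatchError, and the column names tweet_id and text are hypothetical names chosen for this example, and the in-memory list only stands in for a real transaction log.

```python
from dataclasses import dataclass, field


class SchemaMismatchError(ValueError):
    """Raised when a row does not match the table schema (hypothetical name)."""


@dataclass
class ToyLakehouseTable:
    """Toy stand-in for a lakehouse table; NOT the Delta Lake API."""

    schema: dict  # column name -> expected Python type
    _log: list = field(default_factory=list)  # one entry per committed batch

    def _validate(self, row: dict) -> None:
        # Schema enforcement: column names and value types must match exactly.
        if set(row) != set(self.schema):
            raise SchemaMismatchError(
                f"columns {sorted(row)} do not match {sorted(self.schema)}"
            )
        for name, expected in self.schema.items():
            if not isinstance(row[name], expected):
                raise SchemaMismatchError(
                    f"column {name!r} expected {expected.__name__}"
                )

    def append(self, batch: list) -> int:
        # Validate every row first, so a bad row rejects the whole batch
        # and nothing is partially written (all-or-nothing commit).
        for row in batch:
            self._validate(row)
        self._log.append([dict(row) for row in batch])
        return len(self._log) - 1  # version number of this commit

    def read(self, version=None) -> list:
        # Reading an older version replays the log only up to that commit.
        last = len(self._log) - 1 if version is None else version
        return [row for commit in self._log[: last + 1] for row in commit]


table = ToyLakehouseTable(schema={"tweet_id": int, "text": str})
v0 = table.append([{"tweet_id": 1, "text": "hello lakehouse"}])

rejected = False
try:
    # The second row has a string id, so the whole batch is refused.
    table.append([
        {"tweet_id": 2, "text": "fine"},
        {"tweet_id": "3", "text": "wrong type"},
    ])
except SchemaMismatchError:
    rejected = True
```

After the failed append the table still holds only the first commit, which is the behaviour the overview attributes to ACID transactions combined with schema enforcement. A real lakehouse table persists its log to storage and handles concurrent writers, which this sketch does not attempt.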

Syllabus

Intro
About Data Lakehouse
First wow moment
High performance computing
Streaming data
Stream processing
Technical advantages
Misconceptions
CAP Theorem
Summary
Spark Structured Streaming
Project Lightspeed
Streaming ETL
Data Lakehouse
Data Maturity Curve
Two Paradigms
The Lakehouse
How is this working
Google for Lakehouse
Delta Lake
Delta
Does it work
Short answer
Autoloader
Orchestration
Data + AI Summit
Medallion Architecture
The Next Step
Corona
I was sick
I built a simulator
DLT pipeline
Kafka
Faker
Streaming Pipeline
Data Pipeline
Data Analytics
Simple Asset Transaction
Data Warehouse


Taught by

Devoxx

Related Courses

Text Mining and Analytics
University of Illinois at Urbana-Champaign via Coursera
Introduction to Natural Language Processing
University of Michigan via Coursera
Enabling Technologies for Data Science and Analytics: The Internet of Things
Columbia University via edX
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
moocTLH: Nuevos retos en las tecnologías del lenguaje humano
Universidad de Alicante via Miríadax