YoVDO

The Data Lakehouse - A Tech Talk for Everyone Who Loves Data

Offered By: Devoxx via YouTube

Tags

Devoxx Courses, Machine Learning Courses, Governance Courses, Data Lakes Courses, Delta Lake Courses

Course Description

Overview

Explore the technical side of the data lakehouse in this 45-minute tech talk from Devoxx. See how a lakehouse combines the data lake and the data warehouse to deliver reliability, governance, and performance while staying open, flexible, and suited to machine learning. Examine open-source projects such as Delta Lake that turn data lakes into lakehouses by adding ACID transactions, schema enforcement, and time travel. The talk focuses on streaming data applications and covers open-source table formats, data ingestion, pipelines, data quality, workflows, analysis, and machine learning. Gain insight into Project Lightspeed and its impact on Apache Spark Structured Streaming. Watch live code demonstrations, including sentiment analysis of a Twitter stream using Hugging Face. Ideal for data architects, engineers, and practitioners interested in open-source solutions and cloud services; the demonstrations use the Databricks Lakehouse.

Syllabus

Intro
Favorite thing about Apache Spark
Structured streaming
The Data Lakehouse
How does it work
Does it perform
Streaming
Data pipelines
Data donation project
Streaming in Kafka


Taught by

Devoxx

Related Courses

Startup Boards: Advanced Entrepreneurship
Stanford University via NovoEd
The European Union in Global Governance
iversity
Public Privacy: Cyber Security & Human Rights
Humboldt-Viadrina School of Governance via iversity
Villes africaines I: Introduction à la planification urbaine
École Polytechnique Fédérale de Lausanne via Coursera
Leadership in 21st Century Organizations
Copenhagen Business School via Coursera