YoVDO

Efficiently Stream Data into Your Medallion Architecture with Apache Hudi

Offered By: The ASF via YouTube

Tags

Data Lakes Courses
Data Streaming Courses
Medallion Architecture Courses
Apache Hudi Courses

Course Description

Overview

Discover how to efficiently stream data into a medallion architecture using Apache Hudi in this 42-minute conference talk from The ASF. Learn about the challenges of building a medallion architecture on streaming data sources and how Apache Hudi, a transactional data lake platform, addresses them. Explore Hudi's new record-level index, which speeds up upserts, and its database-style change data capture (CDC) feature. See how the record index and incremental processing work in Hudi, and how the CDC feature enables incremental processing on the lake. Presented by Ethan Guo, an Apache Hudi committer and Database Engineer at Onehouse, this talk is aimed at those interested in optimizing data streaming and lakehouse architecture.

Syllabus

A glide, skip or a jump: Efficiently stream data into your medallion architecture with Apache Hudi


Taught by

The ASF

Related Courses

History and Evolution of Data Lake Architecture - Post Lambda Architecture
Linux Foundation via YouTube
Transform Your Machine Learning Pipelines with Apache Hudi
Linux Foundation via YouTube
Delivering Portability to Open Data Lakes with Delta Lake UniForm
Databricks via YouTube
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts
Databricks via YouTube
Apache XTable - Interoperability Among Lakehouse Table Formats
Databricks via YouTube