YoVDO

Apache Paimon Stream Data Lake: CDC Feed and Stream Read

Offered By: The ASF via YouTube

Tags

Apache Spark Courses Apache Flink Courses Data Ingestion Courses Trino Courses

Course Description

Overview

Explore the capabilities of Apache Paimon (incubating), a streaming data lake storage technology, in this 36-minute conference talk by Li Jinsong, a senior technical specialist at Alibaba and PMC member of Apache Flink. The talk covers high-throughput, low-latency data ingestion, streaming subscriptions, and real-time queries. Discover how Paimon's open data format and design integrate with leading computing engines such as Apache Flink, Spark, and Trino. Learn about key features, including CDC ingestion into the lake with schema evolution, whole-database CDC ingestion, partial-column updates during CDC ingestion, and real-time changelog stream reading. Gain insights into the future of streaming lake storage technology from a speaker with extensive experience in distributed stream computing, distributed batch computing, and lake storage.

Syllabus

Apache Paimon Stream Data Lake: CDC Feed Lake and Stream Read


Taught by

The ASF

Related Courses

Delta Lake 2.0 Overview - New Features and Community Collaborations
Databricks via YouTube
Why Lakehouse Architecture Now - Exploring Enterprise Data Warehouse Failures and the Need for Lakehouse Paradigm
Databricks via YouTube
Scaling Climate Data for FinTech with an Open Source Data Mesh
Linux Foundation via YouTube
Getting Started with Delta Lake
Linux Foundation via YouTube
ETL - Extract Trino Load: A Case for Trino as a Batch Processing Engine
Linux Foundation via YouTube