YoVDO

Datalake Rock Paper Scissors: Iceberg with Flink or Spark - Performance Comparison

Offered By: Confluent via YouTube

Tags

- Apache Spark Courses
- Apache Kafka Courses
- Apache Flink Courses
- Data Lakes Courses
- Scalability Courses
- Real-Time Data Processing Courses
- Data Pipelines Courses
- Data Ingestion Courses
- Apache Iceberg Courses

Course Description

Overview

Explore a conference talk from Current 2023 that compares Apache Flink and Apache Spark for ingesting data from Apache Kafka into an Apache Iceberg datalake. Drawing on Bloomberg's experience, Sitarama Chekuri and Ben de Vera compare the two technologies on functionality, performance, fault tolerance, scaling, and resource utilization, with a focus on data pipelines that deliver to storage sinks at near-real-time speeds. The talk covers the motivation behind Bloomberg's approach, an overview of the technologies involved, the performance comparison, and how to scale to multiple applications, and it closes with a summary of lessons learned. The 36-minute presentation looks at datalake architecture choices for organizations that use Kafka and Iceberg in their data infrastructure.

Syllabus

- Intro
- Context on Bloomberg and speakers
- Motivation
- Technology overview
- Performance comparison
- Scale to multiple applications
- Summary


Taught by

Confluent

Related Courses

Building Modern Data Streaming Apps with Open Source
Linux Foundation via YouTube
How to Stabilize a GenAI-First Modern Data LakeHouse - Provisioning 20,000 Ephemeral Data Lakes per Year
CNCF [Cloud Native Computing Foundation] via YouTube
Data Storage and Queries
DeepLearning.AI via Coursera
Delivering Portability to Open Data Lakes with Delta Lake UniForm
Databricks via YouTube
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts
Databricks via YouTube