YoVDO

How to Stabilize a GenAI-First Modern Data LakeHouse - Provisioning 20,000 Ephemeral Data Lakes per Year

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

SQL Courses Kubernetes Courses Scalability Courses OpenTelemetry Courses Apache Iceberg Courses

Course Description

Overview

Explore strategies for stabilizing a GenAI-first modern data lakehouse in this 32-minute conference talk from the Cloud Native Computing Foundation (CNCF). Learn how LinkedIn tackled the challenges of scaling its exabyte-scale data lake while introducing GenAI large language models, migrating to Apache Iceberg, and beginning its object storage journey. Discover approaches for maintaining platform stability without compromising innovation, with a focus on AI and unified SQL. Gain insight into a low-latency system that auto-builds lightweight data lakes on Kubernetes for every code commit and pull request, and learn how flow-failure insights were scaled using OpenTelemetry and the JVM. Understand how these techniques enabled the provisioning of over 20,000 ephemeral data lakes per year, catching 2,100 platform issues in the process.
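The talk summary does not detail how the per-commit lakes are provisioned, but the general pattern — a namespaced, short-lived lakehouse stack created for each pull request and torn down afterward — can be sketched as a minimal Kubernetes manifest. Every name here (namespace, image, labels, arguments) is an illustrative assumption, not LinkedIn's actual configuration:

```yaml
# Hypothetical ephemeral data-lake environment, one per pull request.
# The PR number would be templated in by CI; nothing here reflects
# LinkedIn's real setup.
apiVersion: v1
kind: Namespace
metadata:
  name: lakehouse-pr-1234           # unique per PR; deleted when the PR closes
  labels:
    ephemeral: "true"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: seed-mini-lake
  namespace: lakehouse-pr-1234
spec:
  ttlSecondsAfterFinished: 3600     # auto-clean the Job an hour after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: seed
          image: example.com/mini-lakehouse:latest    # hypothetical image
          args: ["--seed-tables", "--run-smoke-sql"]  # load fixture data, run SQL checks
```

In a setup like this, a CI webhook would apply the manifest on every commit and delete the namespace when the PR closes, which is what keeps thousands of lightweight lakes per year cheap to run and lets platform regressions surface before merge.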

Syllabus

How to Stabilize a GenAI-First, Modern Data LakeHouse: Provision 20,000 Ephemeral Data Lakes/Year


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

Building Modern Data Streaming Apps with Open Source
Linux Foundation via YouTube
Data Storage and Queries
DeepLearning.AI via Coursera
Delivering Portability to Open Data Lakes with Delta Lake UniForm
Databricks via YouTube
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts
Databricks via YouTube
Capital One's Data Innovation Strategy - You Build, Your Data (YBYD)
Databricks via YouTube