How to Stabilize a GenAI-First Modern Data LakeHouse - Provisioning 20,000 Ephemeral Data Lakes per Year
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore strategies for stabilizing a GenAI-first modern data lakehouse in this 32-minute conference talk from the Cloud Native Computing Foundation (CNCF). Learn how LinkedIn tackled the challenge of scaling its exabyte-scale data lake while introducing GenAI large language models (LLMs), migrating to Apache Iceberg, and beginning its move to object storage. Discover approaches for maintaining platform stability without compromising innovation, with a focus on AI and unified SQL. Gain insights into a low-latency system that automatically builds lightweight data lakes on Kubernetes for every code commit and pull request, and learn how to scale flow-failure insights using OpenTelemetry on the JVM. Understand how these techniques enabled the provisioning of over 20,000 ephemeral data lakes per year, catching 2,100 platform issues in the process.
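As a rough illustration of the flow-failure idea described above (not code from the talk itself), the minimal Java sketch below wraps one step of a data flow in an OpenTelemetry span so that failures surface as structured, queryable telemetry on the JVM. The class, flow/step names, and attribute keys are illustrative assumptions; only the standard OpenTelemetry tracing API is used.

// Minimal sketch, assuming an OpenTelemetry SDK is already configured elsewhere
// (GlobalOpenTelemetry picks up whatever exporter/collector the platform provides).
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class FlowFailureInsights {
    private static final Tracer TRACER =
            GlobalOpenTelemetry.getTracer("ephemeral-lake-validation"); // illustrative scope name

    /** Runs one step of a flow, recording any failure on the span for later triage. */
    public static void runStep(String flowName, String stepName, Runnable step) {
        Span span = TRACER.spanBuilder(flowName + "/" + stepName)
                .setAttribute("flow.name", flowName)   // hypothetical attribute keys
                .setAttribute("flow.step", stepName)
                .startSpan();
        try (Scope ignored = span.makeCurrent()) {
            step.run();
            span.setStatus(StatusCode.OK);
        } catch (RuntimeException e) {
            span.recordException(e);                   // capture the failure detail
            span.setStatus(StatusCode.ERROR, "flow step failed");
            throw e;
        } finally {
            span.end();                                // exported via the configured SDK
        }
    }
}

Wrapping each step this way lets failure counts and error details be aggregated per flow and per step in whatever backend the traces are exported to, which is one plausible way to scale failure insight across many ephemeral environments.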
Syllabus
How to Stabilize a GenAI-First, Modern Data LakeHouse: Provision 20,000 Ephemeral Data Lakes/Year
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Building Modern Data Streaming Apps with Open Source (Linux Foundation via YouTube)
Data Storage and Queries (DeepLearning.AI via Coursera)
Delivering Portability to Open Data Lakes with Delta Lake UniForm (Databricks via YouTube)
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts (Databricks via YouTube)
Capital One's Data Innovation Strategy - You Build, Your Data (YBYD) (Databricks via YouTube)