Incremental Change Data Capture: A Data-Informed Journey
Offered By: Databricks via YouTube
Course Description
Overview
Embark on a data-informed journey exploring incremental change data capture (CDC) in this conference talk. Discover how to iterate on incremental ingestion from SaaS applications, relational databases, and event streams into a centralized data lake. Learn to make architecture decisions based on evidence and specific use cases, promoting long-term stewardship and developer happiness. Follow the speaker's experience with sourcing from Salesforce, utilizing Overwatch's insights for load-balancing connectors and achieving significant cost savings. Explore three flavors of CDC, from naive to feature-rich approaches, including batch polling and log streaming. Understand how query-based CDC and Lakehouse Federation can reduce the maintenance burden and eliminate bugs. Delve into Liquid Clustering's ability to address data skew across customers and improve write performance. Gain insights on streamlining maintenance and improving reliability with the latest Delta Lake features.
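For readers unfamiliar with the query-based flavor of CDC mentioned above, here is a minimal sketch of the general idea, not the speaker's implementation. It assumes a source table with a monotonically increasing `updated_at` column and uses an in-memory SQLite database as a stand-in source. The table name `accounts` and the helpers `poll_changes`, `apply_changes`, and `demo` are hypothetical names chosen for illustration.

```python
import sqlite3


def poll_changes(conn, watermark):
    """Return rows modified after `watermark`, plus the advanced watermark."""
    rows = conn.execute(
        "SELECT id, name, updated_at FROM accounts "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Advance the watermark only when new rows were seen.
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark


def apply_changes(target, rows):
    """Upsert the polled rows into the target, keyed by primary key."""
    for row_id, name, updated_at in rows:
        target[row_id] = {"name": name, "updated_at": updated_at}


def demo():
    # Hypothetical source: an in-memory table standing in for a real database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [(1, "Acme", 100), (2, "Globex", 110)],
    )

    target = {}    # stand-in for the data lake table
    watermark = 0  # nothing ingested yet

    # First poll picks up both initial rows.
    rows, watermark = poll_changes(conn, watermark)
    apply_changes(target, rows)

    # The source changes; the second poll fetches only the updated row.
    conn.execute(
        "UPDATE accounts SET name = ?, updated_at = ? WHERE id = ?",
        ("Acme Corp", 120, 1),
    )
    rows, watermark = poll_changes(conn, watermark)
    apply_changes(target, rows)
    return target, watermark
```

Calling `demo()` runs two polling cycles and returns the target state with the final watermark. A polling approach like this cannot observe hard deletes in the source, and a strict `>` comparison can miss rows committed later with an equal timestamp; these are the kinds of gaps that log-based streaming is typically used to close.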
Syllabus
Incremental Change Data Capture: A Data-Informed Journey
Taught by
Databricks
Related Courses
Data Lakes for Big Data (EdCast)
Distributed Computing with Spark SQL (University of California, Davis via Coursera)
Modernizing Data Lakes and Data Warehouses with Google Cloud (Google Cloud via Coursera)
Data Engineering with AWS (Udacity)
Preparing for Google Cloud Certification: Cloud Data Engineer (Google Cloud via Coursera)