Transform Your Machine Learning Pipelines with Apache Hudi
Offered By: Linux Foundation via YouTube
Course Description
Overview
Learn how to improve machine learning pipelines built on data lakes in this 25-minute conference talk by Nadine Farah from Onehouse. The talk covers why it is hard to keep data fresh, accurate, and available in near real time for ML models in traditional data lakes, and how Apache Hudi addresses these problems with upserts, incremental processing, and near real-time access. It also covers building efficient ML pipelines with Hudi's time-travel queries and incremental data pulls. By the end, you should understand how to reduce data latency, apply incremental updates, and deliver data to ML models on time.
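The three ideas named above (upserts, incremental pulls, and time-travel queries) can be illustrated with a small, self-contained Python sketch. This is a conceptual toy only and does not use or reproduce the Apache Hudi API: the `ToyTable` class, its method names, the integer commit times, and the `clicks` feature are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Conceptual stand-in for an upsertable table; NOT the Apache Hudi API."""

    # Each commit is (commit_time, {record_key: row}) and holds only changed rows.
    # Assumption: commits are appended in increasing commit_time order.
    commits: list = field(default_factory=list)

    def upsert(self, commit_time, rows, key="id"):
        """Insert new keys and overwrite existing keys in a single commit."""
        self.commits.append((commit_time, {row[key]: row for row in rows}))

    def snapshot(self, as_of=None):
        """Time travel: rebuild the table state as of a given commit time."""
        state = {}
        for commit_time, changes in self.commits:
            if as_of is None or commit_time <= as_of:
                state.update(changes)
        return state

    def incremental(self, after):
        """Incremental pull: latest version of each row changed after `after`."""
        changed = {}
        for commit_time, changes in self.commits:
            if commit_time > after:
                changed.update(changes)
        return changed


table = ToyTable()
table.upsert(1, [{"id": "u1", "clicks": 3}, {"id": "u2", "clicks": 5}])
table.upsert(2, [{"id": "u1", "clicks": 4}, {"id": "u3", "clicks": 1}])

latest = table.snapshot()             # current state after both commits
as_of_first = table.snapshot(as_of=1)  # state as it was after commit 1
delta = table.incremental(after=1)     # only rows touched since commit 1
```

In an ML pipeline, the incremental pull is what lets a feature job reprocess only the changed rows instead of rescanning the whole table, while the time-travel snapshot lets a training run be reproduced against the data as it existed at an earlier commit.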
Syllabus
Unveil the Magic Without Hoodini: Transform Your Machine Learning Pipelines with Apa... Nadine Farah
Taught by
Linux Foundation
Related Courses
History and Evolution of Data Lake Architecture - Post Lambda Architecture
Linux Foundation via YouTube
Delivering Portability to Open Data Lakes with Delta Lake UniForm
Databricks via YouTube
Fast Copy-On-Write in Apache Parquet for Data Lakehouse Upserts
Databricks via YouTube
Apache XTable - Interoperability Among Lakehouse Table Formats
Databricks via YouTube
How to Migrate from Snowflake to an Open Data Lakehouse Using Delta Lake UniForm
Databricks via YouTube