YoVDO

Building Data Infrastructure at Scale for AI/ML with Open Data Lakehouses

Offered By: MLOps.community via YouTube

Tags

Data Engineering Courses, Apache Spark Courses, Apache Kafka Courses, Distributed Systems Courses, Vector Databases Courses, Feature Engineering Courses, Real-Time Data Processing Courses, Apache Hudi Courses

Course Description

Overview

Explore how data lakehouse architecture with Apache Hudi supports real-world predictive ML and vector-based AI use cases in this 30-minute keynote by Vinoth Chandar, creator of Apache Hudi. Learn about ingesting data with minute-level freshness, providing a single source of truth for structured and unstructured data, and utilizing lakehouses for feature engineering, training dataset generation, and production feature creation. Discover the role of lakehouses in GenAI applications, including operating vector generation pipelines at scale and integrating with vector databases for real-time serving. Gain insights from use cases across organizations like NielsenIQ, Notion, and Uber, and understand how data engineers can leverage existing tools or develop new solutions to address AI and ML challenges.

Syllabus

Building Data Infrastructure at Scale for AI/ML with Open Data Lakehouses // Vinoth Chandar // DE4AI


Taught by

MLOps.community

Related Courses

In-Memory Database Management (内存数据库管理)
openHPI
CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Processing Big Data with Azure Data Lake Analytics
Microsoft via edX
Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Google Cloud Big Data and Machine Learning Fundamentals 日本語版
Google Cloud via Coursera