YoVDO

Building a Multimodal Data Lakehouse with the Daft Distributed Python Dataframe

Offered By: Databricks via YouTube

Tags

Data Engineering Courses
Python Courses
Databricks Courses
Distributed Computing Courses
DataFrames Courses
Delta Lake Courses
ETL Courses

Course Description

Overview

Explore how to build a multimodal data lakehouse using Daft, a distributed query engine, in this 18-minute conference talk. Learn how to process diverse data types, including numbers, strings, JSON, images, and PDFs, at scale through a familiar dataframe interface. Discover how Daft simplifies large-scale ETL by removing the need for bespoke data pipelines and custom tooling. See a demonstration of Daft integrated with existing infrastructure such as S3, Delta Lake, Databricks, and Spark. Jay Chia, Co-Founder of Eventual Computing, explains how Daft's Python- and Rust-based architecture handles multimodal data efficiently in modern data workloads.

Syllabus

Building a Multimodal Data Lakehouse with the Daft Distributed Python Dataframe


Taught by

Databricks

Related Courses

Artificial Intelligence for Robotics
Stanford University via Udacity
Intro to Computer Science
University of Virginia via Udacity
Design of Computer Programs
Stanford University via Udacity
Web Development
Udacity
Programming Languages
University of Virginia via Udacity