YoVDO

Building a Multimodal Data Lakehouse with the Daft Distributed Python Dataframe

Offered By: Databricks via YouTube

Tags

Data Engineering Courses, Python Courses, Databricks Courses, Distributed Computing Courses, DataFrames Courses, Delta Lake Courses, ETL Courses

Course Description

Overview

Explore how to build a multimodal data lakehouse with Daft, a distributed query engine, in this 18-minute conference talk. Learn how to process diverse data types, including numbers, strings, JSON, images, and PDFs, at scale through a familiar dataframe interface. Discover how Daft simplifies large-scale ETL and removes the need for bespoke data pipelines and custom tooling. See a demonstration of Daft integrated with existing infrastructure such as S3, Delta Lake, Databricks, and Spark. Gain insights from Jay Chia, Co-Founder of Eventual Computing, on how Daft's Python and Rust-based architecture handles multimodal data efficiently in modern data workloads.

Syllabus

Building a Multimodal Data Lakehouse with the Daft Distributed Python Dataframe

Taught by

Databricks

Related Courses

Julia Scientific Programming
University of Cape Town via Coursera
Spark
Udacity
AI Workflow: Enterprise Model Deployment
IBM via Coursera
Apache Spark with Scala - Hands On with Big Data!
Udemy
Taming Big Data with Apache Spark and Python - Hands On!
Udemy