Delta Tensor: Efficient Vector and Tensor Storage in Delta Lake
Offered By: Databricks via YouTube
Course Description
Overview
Explore an approach to storing and managing tensor data in Delta Lake through this 15-minute conference talk. Learn about Delta Tensor, a method that streamlines data loading and improves storage efficiency for machine learning workflows. Discover how chunking reduces IO costs for tensor slicing and how sparse encoding improves storage efficiency for sparse tensors. Gain insights into building an efficient tensor storage and management solution within a cloud-native Lakehouse environment. Presented by Zhiyu Wu, a student at Northeastern University, this talk is aimed at data engineers and machine learning practitioners working with tensor data in cloud-based systems.
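The two ideas named in the overview, chunking and sparse encoding, can be illustrated with a minimal pure-Python sketch. It is not Delta Tensor's actual storage layout or API: the function names (`chunk_matrix`, `chunks_for_slice`, `to_coo`, `from_coo`) are hypothetical, the tensor is a small 2-D list, and coordinate (COO) triples are assumed here as one common sparse encoding. The point is only that a slice needs to touch the chunks it overlaps, and that a sparse tensor can be stored as its nonzero entries.

```python
from itertools import product


def chunk_matrix(matrix, chunk_rows, chunk_cols):
    """Split a 2-D list into tiles keyed by (chunk_row, chunk_col)."""
    chunks = {}
    n_rows, n_cols = len(matrix), len(matrix[0])
    for r0 in range(0, n_rows, chunk_rows):
        for c0 in range(0, n_cols, chunk_cols):
            tile = [row[c0:c0 + chunk_cols] for row in matrix[r0:r0 + chunk_rows]]
            chunks[(r0 // chunk_rows, c0 // chunk_cols)] = tile
    return chunks


def chunks_for_slice(row_range, col_range, chunk_rows, chunk_cols):
    """Return the keys of the chunks that overlap a half-open slice."""
    (r_start, r_stop), (c_start, c_stop) = row_range, col_range
    rows = range(r_start // chunk_rows, (r_stop - 1) // chunk_rows + 1)
    cols = range(c_start // chunk_cols, (c_stop - 1) // chunk_cols + 1)
    return list(product(rows, cols))


def to_coo(matrix):
    """Encode only the nonzero entries as (row, col, value) triples."""
    return [(r, c, v)
            for r, row in enumerate(matrix)
            for c, v in enumerate(row)
            if v != 0]


def from_coo(triples, shape):
    """Rebuild the dense 2-D list from COO triples and the original shape."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for r, c, v in triples:
        dense[r][c] = v
    return dense


# Chunking: a 4x4 matrix stored as four 2x2 tiles.
dense = [[r * 4 + c for c in range(4)] for r in range(4)]
chunks = chunk_matrix(dense, 2, 2)

# Slicing rows 0..1 and columns 2..3 touches one tile out of four.
needed = chunks_for_slice((0, 2), (2, 4), 2, 2)

# Sparse encoding: two nonzeros stored instead of nine cells.
sparse = [[0, 0, 5], [0, 0, 0], [7, 0, 0]]
triples = to_coo(sparse)
```

In a table-backed store, each tile or each group of triples would plausibly map to its own row, so a slice query reads only the rows whose chunk keys appear in `needed`.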
Syllabus
Delta Tensor: Efficient Vector and Tensor Storage in Delta Lake
Taught by
Databricks
Related Courses
Distributed Computing with Spark SQL (University of California, Davis via Coursera)
Apache Spark (TM) SQL for Data Analysts (Databricks via Coursera)
Building Your First ETL Pipeline Using Azure Databricks (Pluralsight)
Implement a data lakehouse analytics solution with Azure Databricks (Microsoft via Microsoft Learn)
Perform data science with Azure Databricks (Microsoft via Microsoft Learn)