Delta Tensor: Efficient Vector and Tensor Storage in Delta Lake
Offered By: Databricks via YouTube
Course Description
Overview
Explore an innovative approach to storing and managing tensor data in Delta Lake through this 15-minute conference talk. Learn about Delta Tensor, a method that streamlines data loading processes and improves storage efficiency for machine learning workflows. Discover how chunking techniques reduce IO costs for tensor slicing and how sparse encoding methods enhance storage efficiency for sparse tensors. Gain insights into creating an efficient storage and management solution within a cloud-native Lakehouse environment. Presented by Zhiyu Wu, a student from Northeastern University, this talk offers valuable knowledge for data engineers and machine learning practitioners working with tensor data in cloud-based systems.
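The two ideas named in the overview, chunking and sparse encoding, can be illustrated with a small sketch. This is not Delta Tensor's actual storage format, which the description does not specify. It is a minimal pure-Python illustration that assumes a 2-D tensor held as nested lists, fixed-size rectangular chunks, and coordinate (COO) encoding as one common sparse scheme. All function names are hypothetical.

```python
def chunk_tensor(tensor, chunk_rows, chunk_cols):
    """Split a 2-D nested list into chunks keyed by (chunk_row, chunk_col)."""
    n_rows, n_cols = len(tensor), len(tensor[0])
    chunks = {}
    for r0 in range(0, n_rows, chunk_rows):
        for c0 in range(0, n_cols, chunk_cols):
            key = (r0 // chunk_rows, c0 // chunk_cols)
            chunks[key] = [row[c0:c0 + chunk_cols]
                           for row in tensor[r0:r0 + chunk_rows]]
    return chunks


def chunks_for_slice(row_range, col_range, chunk_rows, chunk_cols):
    """Return keys of the chunks that overlap a half-open (start, stop) slice.

    Only these chunks would need to be read, which is where the IO saving
    for slicing comes from.
    """
    r_start, r_stop = row_range
    c_start, c_stop = col_range
    rows = range(r_start // chunk_rows, (r_stop - 1) // chunk_rows + 1)
    cols = range(c_start // chunk_cols, (c_stop - 1) // chunk_cols + 1)
    return [(r, c) for r in rows for c in cols]


def to_coo(tensor):
    """Encode a 2-D tensor as (row, col, value) triples for non-zero entries."""
    return [(i, j, v)
            for i, row in enumerate(tensor)
            for j, v in enumerate(row)
            if v != 0]


def from_coo(entries, shape):
    """Rebuild the dense 2-D tensor from COO triples and its (rows, cols) shape."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i, j, v in entries:
        dense[i][j] = v
    return dense
```

With 2x2 chunks over a 4x4 tensor, a slice covering rows 0 to 1 and columns 0 to 1 touches one chunk out of four. A sparse tensor with two non-zero entries is stored as two triples rather than as every cell.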
Syllabus
Delta Tensor: Efficient Vector and Tensor Storage in Delta Lake
Taught by
Databricks
Related Courses
In-Memory Database Management (openHPI)
CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Processing Big Data with Azure Data Lake Analytics (Microsoft via edX)
Google Cloud Big Data and Machine Learning Fundamentals en Español (Google Cloud via Coursera)
Google Cloud Big Data and Machine Learning Fundamentals 日本語版 (Google Cloud via Coursera)