DeepSketch - A New Machine Learning-Based Reference Search Technique for Post-Deduplication Delta Compression
Offered By: USENIX via YouTube
Course Description
Overview
Syllabus
Intro
Executive Summary
Data Reduction in Storage Systems
Post-Deduplication Delta Compression - Combines three different data-reduction approaches
Overview of Post-Deduplication Delta Compression
Lossless Compression
Key Challenge: Reference Search - How to find a good reference block for an incoming data block across a wide range of stored data at low cost
Limitations of Existing Techniques - Provide significantly lower data-reduction ratios than the optimal
DeepSketch: Key Idea - Use the learning-to-hash method for sketch generation - A promising machine learning (ML)-based approach for the
DeepSketch: Challenges - Lack of semantic information
Data Clustering for DeepSketch - Existing clustering algorithms are unsuitable for DeepSketch
Post-Processing for Training Data Set - Non-uniform distribution of data blocks across the clusters
Evaluation Methodology - Compared data-reduction techniques
Overall Data-Reduction Benefits
Performance Overhead
Taught by
USENIX
Related Courses
Understanding the Robustness of SSDs under Power Fault
USENIX via YouTube
BetrFS - A Right-Optimized Write-Optimized File System
USENIX via YouTube
F2FS - A New File System for Flash Storage
USENIX via YouTube
DNA Data Storage and Near-Molecule Processing for the Yottabyte Era
USENIX via YouTube
FAST '21 Work-in-Progress Reports
USENIX via YouTube