YoVDO

Tectonic-Shift - A Composite Storage Fabric for Large-Scale ML Training

Offered By: USENIX via YouTube

Tags

USENIX Annual Technical Conference Courses
Machine Learning Courses
Cloud Computing Courses
Software Design Courses

Course Description

Overview

Explore a 20-minute conference talk from USENIX ATC '23 on Tectonic-Shift, a composite storage fabric designed for large-scale machine learning training at Meta. See how the system meets the intensive IO and high-capacity storage demands of industrial ML training. Learn how workload characterization informed the hardware and software design, and why combining Shift, a flash storage tier, with Tectonic, Meta's HDD-based distributed file system, maximizes storage power efficiency. Gain insight into application-aware cache policies that infer future access patterns from training dataset specifications and absorb 1.51x to 3.28x more IO than a traditional LRU flash cache. Understand how Tectonic-Shift cuts power demand by 29% in petabyte-scale production clusters.
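To make the idea of an application-aware cache policy concrete, here is a toy Python sketch that compares a plain LRU cache with a policy that knows, from job specifications, how many more times each partition will be read. It is an illustration only, not Shift's actual policy: the function names, the `jobs` layout, the partition names, and the admit/evict rule are all assumptions made for this example.

```python
from collections import Counter, OrderedDict


def lru_hits(trace, capacity):
    """Count cache hits for a plain LRU cache over an access trace."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[key] = True
    return hits


def spec_aware_hits(trace, capacity, planned_reads):
    """Count hits for a toy policy that uses declared future reads.

    planned_reads maps each key to the total number of reads that the
    (hypothetical) job specifications declare for it.
    """
    remaining = dict(planned_reads)
    cache = set()
    hits = 0
    for key in trace:
        remaining[key] -= 1
        if key in cache:
            hits += 1
        elif remaining[key] > 0:           # admit only data that will be re-read
            if len(cache) < capacity:
                cache.add(key)
            else:
                victim = min(cache, key=lambda k: remaining[k])
                if remaining[victim] < remaining[key]:
                    cache.remove(victim)
                    cache.add(key)
        if remaining[key] == 0:
            cache.discard(key)             # no declared future reads: free the slot
    return hits


# Hypothetical job specs: each job lists the partitions it will read, in order.
jobs = [
    ["A", "c1", "c2", "B", "c3"],
    ["A", "c4", "B", "c5"],
    ["A", "B", "c6"],
]
trace = [partition for job in jobs for partition in job]
planned_reads = Counter(trace)

print(lru_hits(trace, capacity=2))                        # 0
print(spec_aware_hits(trace, capacity=2, planned_reads=planned_reads))  # 4
```

In this trace, the one-off partitions `c1` to `c6` keep pushing the shared partitions `A` and `B` out of the LRU cache, while the spec-aware policy never admits data that no later job will read.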

Syllabus

USENIX ATC '23 - Tectonic-Shift: A Composite Storage Fabric for Large-Scale ML Training


Taught by

USENIX

Related Courses

Amazon DynamoDB - A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service
USENIX via YouTube
Faasm - Lightweight Isolation for Efficient Stateful Serverless Computing
USENIX via YouTube
AC-Key - Adaptive Caching for LSM-based Key-Value Stores
USENIX via YouTube
The Future of the Past - Challenges in Archival Storage
USENIX via YouTube
A Decentralized Blockchain with High Throughput and Fast Confirmation
USENIX via YouTube