Your Coflow has Many Flows - Sampling them for Fun and Speed
Offered By: USENIX via YouTube
Course Description
Overview
Explore a conference talk on improving coflow scheduling for enhanced data-intensive application performance. Learn about Philae, a novel online coflow scheduler that leverages the spatial dimension of coflows to reduce overhead in coflow size learning. Discover how this approach utilizes flow sampling to estimate average flow size and implements Shortest Coflow First scheduling. Examine the robustness of sampling-based learning to flow size skew and its scalability benefits. Analyze comparative performance results against prior art Aalo, showcasing significant reductions in coflow completion time across various testbed sizes and production cluster traces. Gain insights into the technical aspects of coflow scheduling, including challenges, practical issues, and comparisons with other approaches like Coda.
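The sampling idea in the overview can be illustrated with a minimal sketch. It is not Philae's actual implementation: the function names, the `sample_count` parameter, and the assumption that flow sizes are known up front (as in a trace-driven simulation) are all hypothetical. Each coflow's total size is estimated as the mean size of a few sampled flows multiplied by its flow count, and coflows are then ordered Shortest Coflow First by that estimate.

```python
import random
from statistics import mean


def estimate_coflow_size(flow_sizes, sample_count, rng):
    """Estimate a coflow's total bytes from a few sampled flows.

    Assumption: flow_sizes is fully known here (simulation setting); a real
    scheduler would only learn the sizes of the flows it samples.
    """
    if not flow_sizes:
        return 0.0
    k = min(sample_count, len(flow_sizes))
    sampled = rng.sample(flow_sizes, k)
    # Estimated total = average sampled flow size * number of flows.
    return mean(sampled) * len(flow_sizes)


def shortest_coflow_first(coflows, sample_count=2, seed=0):
    """Return coflow names ordered by estimated total size, smallest first."""
    rng = random.Random(seed)
    estimates = {
        name: estimate_coflow_size(sizes, sample_count, rng)
        for name, sizes in coflows.items()
    }
    return sorted(estimates, key=estimates.get)


# Hypothetical coflows: name -> list of flow sizes in MB.
example_coflows = {
    "a": [10, 10, 10, 10],
    "b": [1, 1, 1],
    "c": [5] * 100,
}
order = shortest_coflow_first(example_coflows)
```

With uniform flow sizes, as in this example, the estimate is exact regardless of which flows are sampled; with skewed sizes it becomes an approximation whose error depends on the sample.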
Syllabus
Introduction
Big Data Analytics
MapReduce
Communication Phase
Coflow Abstraction
Online Coflow Scheduling
Proposed Online Coflow
Outline
Example
Primary Drawbacks
Intrinsic Overhead
Round-robin
Recap
Doubts about Sampling
Practical Issues
Evaluation of Philae
Philae Speedup
Philae Job Speed
Philae Sensitivity
Summary
Mario Agassi
Practical Challenges
Comparison with Coda
Taught by
USENIX
Related Courses
Amazon DynamoDB - A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service
USENIX via YouTube
Faasm - Lightweight Isolation for Efficient Stateful Serverless Computing
USENIX via YouTube
AC-Key - Adaptive Caching for LSM-based Key-Value Stores
USENIX via YouTube
The Future of the Past - Challenges in Archival Storage
USENIX via YouTube
A Decentralized Blockchain with High Throughput and Fast Confirmation
USENIX via YouTube