Your Coflow has Many Flows - Sampling them for Fun and Speed

Offered By: USENIX via YouTube

Tags

USENIX Annual Technical Conference Courses
MapReduce Courses
Big Data Analytics Courses

Course Description

Overview

Explore a conference talk on improving coflow scheduling to speed up data-intensive applications. Learn about Philae, a novel online coflow scheduler that exploits the spatial dimension of coflows to reduce the overhead of learning coflow sizes. Discover how it samples flows to estimate the average flow size of a coflow and then applies Shortest Coflow First scheduling. Examine how robust sampling-based learning is to flow size skew and how it benefits scalability. Review performance comparisons against the prior-art scheduler Aalo, which show significant reductions in coflow completion time across various testbed sizes and production cluster traces. Gain insight into the technical side of coflow scheduling, including challenges, practical issues, and comparisons with other approaches such as Coda.
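
As a rough illustration of the idea described in the overview, the sketch below estimates each coflow's total size from a small random sample of its flows and then orders coflows shortest-first. It is a minimal sketch under stated assumptions, not Philae's implementation: the function names, the sample count of 2, the fixed seed, and the example flow sizes are all hypothetical, and a real scheduler would measure sampled flows as they run instead of reading sizes from a list.

```python
import random
from statistics import mean


def estimate_coflow_size(flow_sizes, sample_count, rng):
    """Estimate a coflow's total bytes from a random sample of its flows.

    Hypothetical helper: the estimate is (mean sampled flow size) x (number of flows).
    """
    if not flow_sizes:
        return 0.0
    k = min(sample_count, len(flow_sizes))
    sampled = rng.sample(flow_sizes, k)  # sample without replacement
    return mean(sampled) * len(flow_sizes)


def shortest_coflow_first(coflows, sample_count=2, seed=0):
    """Return coflow names ordered by estimated total size, smallest first."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    estimates = {
        name: estimate_coflow_size(sizes, sample_count, rng)
        for name, sizes in coflows.items()
    }
    return sorted(estimates, key=estimates.get)


# Made-up flow sizes (arbitrary units) for three coflows.
coflows = {
    "shuffle_a": [10, 12, 11, 9],
    "shuffle_b": [200, 210, 190, 205, 195],
    "shuffle_c": [50, 55, 45],
}
order = shortest_coflow_first(coflows)
print(order)
```

Because only a few flows per coflow are inspected, the cost of learning a coflow's size does not grow with the number of flows it contains, which is the scalability argument the talk makes for sampling.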

Syllabus

Introduction
Big Data Analytics
MapReduce
Communication Phase
Coflow Abstraction
Online Coflow Scheduling
Proposed Online Coflow
Outline
Example
Primary Drawbacks
Intrinsic Overhead
Round-Robin
Recap
Doubts about Sampling
Practical Issues
Evaluation of Philae
Philae Speedup
Philae Job Speed
Philae Sensitivity
Summary
Mario Agassi
Practical Challenges
Comparison with Coda


Taught by

USENIX

Related Courses

Big Data Analytics in Healthcare
Georgia Institute of Technology via Udacity
Mining Massive Datasets
Stanford University via edX
The Caltech-JPL Summer School on Big Data Analytics
California Institute of Technology via Coursera
Big Data Analytics for Healthcare
Georgia Institute of Technology via Coursera
Data Lakes for Big Data
EdCast