Closed-Loop Network Performance Monitoring and Diagnosis with SpiderMon

Offered By: USENIX via YouTube

Tags

USENIX Symposium on Networked Systems Design and Implementation (NSDI) Courses
Data Centers Courses

Course Description

Overview

Watch a 16-minute conference talk from USENIX NSDI '22 exploring SpiderMon, a novel system for closed-loop network performance monitoring and diagnosis. Learn how SpiderMon leverages "wait-for" relations to achieve low overhead and high coverage simultaneously, addressing limitations of existing query-driven and blanket monitoring approaches. Discover the system's key components, including the Always-on Monitor, Diagnosis Trigger, Timeout Filter, and Purple Traffic Meter. Understand how SpiderMon aligns telemetry data and utilizes Wait-For Graphs to accurately and quickly diagnose performance problems in data center networks. Gain insights into the system's evaluation, complexity, and potential impact on network management practices.

Syllabus

Introduction
Observations
Root causes
Existing solutions
SpiderMon
Always-on Monitor
Diagnosis
Trigger
Timeout Filter
Timeout Filter Example
Purple Traffic Meter
Telemetry Data
Aligning Data
Wait-For Graph
Calculate Degree
Evaluation
Complexity
Summary


Taught by

USENIX

Related Courses

Scaling Memcache at Facebook
USENIX via YouTube
Multi-Person Localization via RF Body Reflections
USENIX via YouTube
Opaque - An Oblivious and Encrypted Distributed Analytics Platform
USENIX via YouTube
Live Video Analytics at Scale with Approximation and Delay-Tolerance
USENIX via YouTube
Clipper - A Low-Latency Online Prediction Serving System
USENIX via YouTube