YoVDO

Thunderbolt - Throughput-Optimized, Quality-of-Service-Aware Power Capping at Scale

Offered By: USENIX via YouTube

Tags

OSDI (Operating Systems Design and Implementation) Courses
Risk Assessment Courses
System Architecture Courses
Data Centers Courses

Course Description

Overview

Explore a conference talk on Thunderbolt, a hardware-agnostic power capping system designed for hyperscale data centers. Learn about the challenges of power oversubscription and the need for task-level quality-of-service differentiation in modern compute clusters. Discover how Thunderbolt ensures safe power oversubscription while minimizing impact on both throughput-oriented and latency-sensitive tasks. Examine the system's architecture, mechanisms, and policies, including its two-threshold control policy and use of CPU bandwidth control. Understand the benefits of Thunderbolt's reactive and proactive capping approaches, and see real-world deployment results in production clusters. Gain insights into power efficiency improvements and the potential for significant power oversubscription gains in data center environments.

Syllabus

Intro
Motivation: power oversubscription and capping
Motivation: task QoS differentiation
Prior industry solutions did not meet our needs
Architecture
Mechanism and policy details
Why not RAPL or DVFS?
CPU bandwidth control, DVFS, RAPL on Intel Skylake CPU
Reactive capping policy: load shaping
Load shaping on a production cluster
Proactive capping mechanism: CPU jailing (deterministic machine CPU cap)
20% CPU jailing on a production cluster
Proactive capping policy: risk assessment
Deployed in logs processing clusters
Summary


Taught by

USENIX

Related Courses

Teaching Impacts of Technology: Fundamentals
University of California, San Diego via Coursera
Microsoft Azure Services and Concepts
Pluralsight
Virtualización con VMware aplicada al mundo empresarial
Udemy
Cloud Deployment Options: Executive Briefing
Pluralsight
Designing Storage Networking for Cisco Data Center Infrastructure
Pluralsight