Full Lifecycle Data Analysis on a Large-scale Leadership Supercomputer

Offered By: USENIX via YouTube

Tags

High Performance Computing Courses
Data Analysis Courses
System Architecture Courses
Supercomputers Courses

Course Description

Overview

Explore an analysis of six years of data from Sunway TaihuLight, the world's 11th-fastest supercomputer, in this 20-minute conference talk from USENIX ATC '24. The study analyzed 40 TB of I/O performance data and job running information from the 41,508-node system. Learn about the challenges application developers and system administrators face in managing complex supercomputer architectures, and about operational management issues in High-Performance Computing (HPC) systems, such as job hanging and starvation. Examine I/O workload characteristics, including spikes in getattr operations and massive file access patterns. The talk covers the study's methodology, findings, and significance, along with potential applications for future research and practice in the HPC domain.

Syllabus

USENIX ATC '24 - Full Lifecycle Data Analysis on a Large-scale and Leadership Supercomputer...


Taught by

USENIX

Related Courses

SAP S/4HANA – Deep Dive
SAP Learning
Information Security- II
Indian Institute of Technology Madras via Swayam
Sistemas de gestión de la energía (Energy Management Systems)
Fundacion para la Eficiencia Energética via Independent
Базы данных (Databases)
Saint Petersburg State University via Coursera
Системное мышление (Systems Thinking)
Moscow Institute of Physics and Technology via Coursera