Turning an Incident Report into a Design Issue with TLA+
Offered By: USENIX via YouTube
Course Description
Overview
Explore how modeling-driven techniques can enhance postmortem analysis of high-impact outages in this 31-minute conference talk from SREcon23 Americas. Learn how TLA+, a formal specification language, can precisely describe and analyze the behavior of microservice architectures under concurrency. Discover how building a specification of Microsoft's CosmosDB helped identify the root cause of a long-lasting outage, uncovered bugs in documentation, and provided insights into underlying design issues. See how precise, reusable TLA+ specifications can help prevent similar incidents and improve overall system reliability.
Syllabus
SREcon23 Americas - Turning an Incident Report into a Design Issue with TLA+
Taught by
USENIX
Related Courses
Incident Detection and Response: The Big Picture (Pluralsight)
Integrated safety, health and environmental management: An introduction (The Open University via OpenLearn)
Threat Intel Analysis of Ukrainians Power Grid Hack (YouTube)
A Year in the Wild - Fighting Malware at the Corporate Level (Security BSides San Francisco via YouTube)
Tales from the VOID - The Scary Truth about Incident Metrics (USENIX via YouTube)