Earthquakes, Forest Fires, and Your Next Production Incident

Offered By: USENIX via YouTube

Tags

LISA (Large Installation System Administration) Conference Courses
Emergency Response Courses

Course Description

Overview

Explore the Incident Command System's origins and applications in this 39-minute LISA19 conference talk. Discover how a tool developed for real-world emergencies like earthquakes and forest fires can be adapted for managing production incidents in computer systems. Learn about the system's evolution, successes, and failures, and gain insights into implementing its most effective aspects for handling your own operational emergencies. Delve into topics such as the impact of changes, the importance of research, key roles like Incident Commander and Operations Lead, and the significance of flexibility in incident response. Gain valuable knowledge on anticipating, training for, and testing incident management strategies to improve your team's readiness for the next production incident.

Syllabus

Intro
THE LAGUNA FIRE
CHANGES IMPACT EVERYTHING
NO GRAY WOLVES
WITH GRAY WOLVES
CHANGES EMIT CHANGES
RESPONDING AGENCIES MEET
FIRESCOPE
SERIOUS RESEARCH COMMENCES
AFTER LOTS OF RESEARCH
SUCCESS!
ALMOST NOTHING IS NEW
INCIDENT COMMANDER (IC)
OPERATIONS LEAD
COMMAND POST
COMMUNICATIONS LEAD
INCIDENT STATE DOCUMENTS
PLANNING LEAD
ICS AND FLEXIBILITY
IS THIS AN INCIDENT?
LOTS WENT WRONG
ANTICIPATE
TRAIN
TEST

Taught by

USENIX

Related Courses

Named Data Networking
USENIX via YouTube
Release Engineering Best Practices at Google
USENIX via YouTube
Efficiently Backing Up Terabytes of Data with PgBackRest
USENIX via YouTube
SRE in the Small and in the Large
USENIX via YouTube
Network-Based LUKS Volume Decryption with Tang
USENIX via YouTube