Effective Disaster Recovery - The Day We Deleted Production
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore a real-world disaster recovery scenario in this 37-minute conference talk from KubeCon + CloudNativeCon. Learn how InfluxData accidentally deleted all compute from a busy production cluster, causing a multi-hour outage. Follow the events leading up to the incident, the recovery process, how customers reacted, and the changes made afterward. The talk covers the CI/CD pipeline configuration and the specific change that triggered the outage, then reviews the disaster recovery plan itself: which parts worked and which needed improvement. It combines technical and management perspectives on handling critical infrastructure failures and building robust disaster recovery strategies.
Syllabus
Effective Disaster Recovery: The Day We Deleted Production - Rick Spencer & Wojciech Kocjan
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Emergency Management - Open2Study
Resilience in Children Exposed to Trauma, Disaster and War: Global Perspectives - University of Minnesota via Coursera
MongoDB Advanced Deployment and Operations - MongoDB University
Arch403: Designing Resilient Schools - Build Academy via EdCast
Bases de données relationnelles : Comprendre pour maîtriser - Inria (French Institute for Research in Computer Science and Automation) via France Université Numerique