The DiRT on Chaos Engineering at Google
Offered By: GOTO Conferences via YouTube
Course Description
Overview
Dive into the world of Chaos Engineering at Google with this conference talk from GOTO 2021. Explore 15 years of disaster resiliency testing (DiRT) as Jason Cahoon, a Site Reliability Engineer at Google, shares lessons learned from thousands of production system tests. Learn why Google tests and what it tests, including the main testing themes and the balance between practical and theoretical approaches. Find out how to bootstrap a disaster testing program, address concerns about breaking production systems, and report results effectively. Examine specific test examples, such as running at service level, toggling discriminators, operating without dependencies, and simulating hacks. Whether you're new to Chaos Engineering or looking to improve your existing practices, this talk covers the groundwork for building more resilient systems.
Syllabus
Intro
DiRT: Disaster Resiliency Testing
Why?
What we test?
Testing themes
Practical vs theoretical
How?
Picking what to test
Steps for bootstrapping a disaster testing program
Testing production vs testing in production
Really, you're breaking production though?!
Reporting on results
What have we learned?
Test example: Run at service level
Test example: Toggle the O-N / O-F-F discriminator
Test example: Run without dependencies
Test example: Hacked!
Taught by
GOTO Conferences
Related Courses
Addressing Algorithmic Bias
GOTO Conferences via YouTube
Empowering Consumers - Evolution of Software in the Future
GOTO Conferences via YouTube
Why Static Typing Came Back
GOTO Conferences via YouTube
Higher Kinded Types in a Lower Kinded Language - Functional Programming in Kotlin
GOTO Conferences via YouTube
It's Not Hard to Test Smart - Delivering Customer Value Faster
GOTO Conferences via YouTube