Time for Chaos - Understanding Chaos Engineering for Infrastructure Resilience
Offered By: Google Cloud Tech via YouTube
Course Description
Overview
Explore Chaos Engineering, a branch of Site Reliability Engineering (SRE), in this 25-minute video presentation from Google Cloud Tech. Learn how to proactively test infrastructure resilience and reliability by simulating failures and injecting faults. Witness a demonstration using popular Chaos Engineering tools like Gremlin and Litmus on Google Kubernetes Engine. Discover fundamental aspects of Chaos Engineering platforms, including designing Chaos Workflows and simulating random pod deletion, network traffic degradation, and disk fill scenarios. Gain insights into the role of observability in Chaos Engineering and key metrics for determining application resiliency scores. Speaker Dharmesh Vaya guides you through the phases of Chaos, from establishing a steady state to formulating hypotheses and achieving end goals. Understand the average costs of infrastructure failures and best practices for implementing Chaos Engineering in your own systems.
Syllabus
Introduction
Average cost of Infra failures
Solution
Chaos Engineering
What is Chaos
Phases of Chaos
Steady State
Hypothesis
End Goal
Platforms
Experiment
Scenarios
Demo
Best Practices
Taught by
Google Cloud Tech
Related Courses
Service Mesh - Crash Course on ISTIO - Part 2
Kode Kloud via YouTube
Just Enough Istio to be Dangerous
Udemy
Remoticon 2021 - Colin O'Flynn Zaps Chips and They Talk
Hackaday via YouTube
FPGA Glitching & Side Channel Attacks
Hackaday via YouTube
Can Applications Recover from fsync Failures?
USENIX via YouTube