Time for Chaos - Understanding Chaos Engineering for Infrastructure Resilience
Offered By: Google Cloud Tech via YouTube
Course Description
Overview
Explore Chaos Engineering, a branch of Site Reliability Engineering (SRE), in this 25-minute video presentation from Google Cloud Tech. Learn how to proactively test infrastructure resilience and reliability by simulating failures and injecting faults. Witness a demonstration using popular Chaos Engineering tools like Gremlin and Litmus on Google Kubernetes Engine. Discover fundamental aspects of Chaos Engineering platforms, including designing Chaos Workflows and simulating random pod deletion, network traffic degradation, and disk fill scenarios. Gain insights into the role of observability in Chaos Engineering and key metrics for determining application resiliency scores. Speaker Dharmesh Vaya guides you through the phases of Chaos, from establishing a steady state to formulating hypotheses and achieving end goals. Understand the average costs of infrastructure failures and best practices for implementing Chaos Engineering in your own systems.
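As a rough illustration of the phases named above (steady state, hypothesis, fault injection, verification), the following minimal sketch simulates random pod deletion in plain Python. It is not taken from the video and does not use Gremlin, Litmus, or any Kubernetes API. `SimulatedService`, `run_experiment`, and `resiliency_score` are hypothetical names, and the weighted pass percentage is only one plausible way to compute a resiliency score.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SimulatedService:
    """Hypothetical stand-in for a replicated workload; not a real cluster API."""
    pods: list = field(default_factory=lambda: ["pod-a", "pod-b", "pod-c"])
    min_healthy: int = 2  # assumed threshold that defines the steady state

    def is_steady(self) -> bool:
        # Steady state: enough pods remain to serve traffic.
        return len(self.pods) >= self.min_healthy

    def delete_random_pod(self, rng: random.Random) -> str:
        # Fault injection: remove one pod chosen at random.
        victim = rng.choice(self.pods)
        self.pods.remove(victim)
        return victim


def run_experiment(service: SimulatedService, rng: random.Random) -> bool:
    """Test the hypothesis 'the service stays steady after losing one pod'."""
    if not service.is_steady():
        raise RuntimeError("steady state not established; aborting experiment")
    service.delete_random_pod(rng)
    return service.is_steady()


def resiliency_score(results) -> float:
    """Weighted pass percentage over (weight, passed) pairs (assumed formula)."""
    total = sum(weight for weight, _ in results)
    passed = sum(weight for weight, ok in results if ok)
    return 100.0 * passed / total


rng = random.Random(0)  # seeded so the run is repeatable
results = [
    # Three replicas: the service tolerates losing one pod.
    (10, run_experiment(SimulatedService(), rng)),
    # Two replicas: losing one pod drops below the threshold.
    (5, run_experiment(SimulatedService(pods=["p1", "p2"]), rng)),
]
score = resiliency_score(results)
print(results, round(score, 2))
```

In a real setup the steady-state check would come from observability metrics rather than a pod count, and the fault would be injected by a chaos platform instead of a list operation.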
Syllabus
Introduction
Average cost of Infra failures
Solution
Chaos Engineering
What is Chaos
Phases of Chaos
Steady State
Hypothesis
End Goal
Platforms
Experiment
Scenarios
Demo
Best Practices
Taught by
Google Cloud Tech
Related Courses
Introduction to Cloud Infrastructure Technologies
Linux Foundation via edX
Scalable Microservices with Kubernetes
Google via Udacity
Google Cloud Fundamentals: Core Infrastructure
Google via Coursera
Introduction to Kubernetes
Linux Foundation via edX
Fundamentals of Containers, Kubernetes, and Red Hat OpenShift
Red Hat via edX