YoVDO

Fast Recovery of Container Failures at Large Scale - The 1-5-10 Approach

Offered By: Linux Foundation via YouTube

Tags

Container Orchestration Courses
DevOps Courses
Cloud Computing Courses
Microservices Courses
Incident Response Courses
Reliability Engineering Courses
Scalability Courses

Course Description

Overview

Explore Alibaba's 1-5-10 theory for fast container failure recovery at scale in this conference talk. As applications grow rapidly in the cloud era, keeping containers reliable becomes harder. The 1-5-10 targets are to detect a problem within 1 minute, diagnose it within 5 minutes, and resolve the failure within 10 minutes. The talk covers three techniques: building an efficient local agent for quick problem detection, implementing intelligent diagnostics backed by expert knowledge bases, and automating container recovery through a failure-driven approach. Together these aim to raise the reliability of large-scale container deployments without additional resource investment.

Syllabus

1-5-10: How to Fast Recover Container Failure at Large Scale - XiongHuan, Alibaba


Taught by

Linux Foundation

Related Courses

Failure Analysis And Prevention
Indian Institute of Technology Roorkee via Swayam
Reliable Cloud Infrastructure: Design and Process en Français
Google Cloud via Coursera
Reliability in Engineering Design
Purdue University via edX
Reliable Google Cloud Infrastructure: Design and Process
Pluralsight