YoVDO

Why I Love Kubernetes Failure Stories and You Should Too

Offered By: GOTO Conferences via YouTube

Tags

GOTO Conferences Courses, DevOps Courses, Kubernetes Courses, Continuous Improvement Courses, Cloud Infrastructure Courses, Incident Management Courses

Course Description

Overview

Explore a senior principal engineer's insights on Kubernetes failure stories in this 33-minute conference talk from GOTO Berlin 2019. Dive into real-world experiences of operating over 100 clusters, uncovering valuable lessons from incidents, failures, and user reports. Learn why Kubernetes remains a sensible choice despite its perceived complexity, and gain practical knowledge on common pitfalls, best practices, and improvements in areas such as ingress errors, CoreDNS OOMKills, and API server issues. Discover the importance of proper resource management, monitoring, and automated testing in maintaining robust Kubernetes environments. Understand the benefits of sharing failure stories for continuous improvement and fostering collaboration across organizations in the Kubernetes ecosystem.

Syllabus

Intro
ZALANDO AT A GLANCE
2019: DEVELOPERS USING KUBERNETES
INGRESS ERRORS
COREDNS OOMKILL
STOP THE BLEEDING: INCREASE MEMORY LIMIT
INCREASE IN MEMORY USAGE
CONTRIBUTING FACTORS
CUSTOMER IMPACT
IAM RETURNING 404
NUMBER OF PODS
ROUTES FROM API SERVER
API SERVER DOWN
INNOCENT MANIFEST
INCIDENT #2: LESSONS LEARNED
CLUSTER DOWN?
THE TRIGGER
CLUSTER LIFECYCLE MANAGER (CLM)
CLUSTER CHANNELS
FLANNEL ERRORS
RBAC CHANGES
NETWORK SPLIT
CREDENTIALS QUEUE
WHAT HAPPENED
SLACK
DISABLING CPU THROTTLING
RACE CONDITIONS...
COMMON PITFALLS
READINESS & LIVENESS PROBES
RESOURCE REQUESTS & LIMITS
AWS EKS IN PRODUCTION
AUTOMATED E2E TESTS
MONITORING
OPENTRACING
UPGRADE TO KUBERNETES 1.14
EMERGENCY ACCESS SERVICE
KUBERNETES FAILURE STORIES
INTERNAL TICKETS BASED ON FAILURE STORIES
FACTFULNESS
WHY KUBERNETES?
COMPLEXITY FOR GOOGLE-SCALE INFRA?
OPEN SOURCE & MORE


Taught by

GOTO Conferences

Related Courses

Architecting Microsoft Azure Solutions
Microsoft via edX
Designing Highly Scalable Web Apps on Google Cloud Platform
Google via Coursera
Windows Server 2016: Azure for On-Premises Administrators
Microsoft via edX
Essential Google Cloud Infrastructure: Foundation
Google Cloud via Coursera
Unlock Your Digital Business with SAP HANA Enterprise Cloud
SAP Learning