Sublinear Scaling in Practice - The 1k SRE Project
Offered By: USENIX via YouTube
Course Description
Overview
Explore a conference talk from SREcon19 Americas on Google's approach to sublinear scaling in Site Reliability Engineering (SRE). Learn how one team grew its service portfolio by over 200% without additional staffing while working toward a goal of 1,000 services. Discover the extensive automation infrastructure the team implemented, including automated incident handling and policy verification. Gain insight into the cultural shift from service-specific expertise to service-agnostic consulting, and into the long-term vision for SRE in large organizations. Topics include imperative and declarative automation, automatic continuous production readiness reviews, and the team's incremental progress toward sublinear scaling.
Syllabus
Intro
About the Team
About Project Work
About Sublinear Scaling
Imperative Automation
Declarative Automation
Sequencer
Automations
Incremental Progress
Automatic Continuous Production Readiness Reviews
Automated Incident Handling
Summary
Taught by
USENIX
Related Courses
How to Not Destroy Your Production Kubernetes Clusters
USENIX via YouTube
SRE and ML - Why It Matters
USENIX via YouTube
Knowledge and Power - A Sociotechnical Systems Discussion on the Future of SRE
USENIX via YouTube
Tracing Bare Metal with OpenTelemetry
USENIX via YouTube
Improving How We Observe Our Observability Data - Techniques for SREs
USENIX via YouTube