Sublinear Scaling in Practice - The 1k SRE Project

Offered By: USENIX via YouTube

Tags

SREcon Courses
Site Reliability Engineering (SRE) Courses

Course Description

Overview

Explore a conference talk from SREcon19 Americas on Google's approach to sublinear scaling in Site Reliability Engineering (SRE). Learn how one team grew its service portfolio by more than 200% without adding staff while working toward a 1,000-service goal. Discover the automation infrastructure the team built, including automated incident handling and policy verification. Gain insight into the cultural shift from service-specific expertise to service-agnostic consulting, and understand the long-term vision for SRE in large organizations. Topics include imperative and declarative automation, automatic continuous production readiness reviews, and the team's incremental progress toward sublinear scaling.

Syllabus

Intro
About the Team
About Project Work
About Sublinear Scaling
Imperative Automation
Declarative Automation
Sequencer
Automations
Incremental progress
Automatic continuous production readiness reviews
Automated incident handling
Summary
Taught by

USENIX

Related Courses

How to Not Destroy Your Production Kubernetes Clusters
USENIX via YouTube
SRE and ML - Why It Matters
USENIX via YouTube
Knowledge and Power - A Sociotechnical Systems Discussion on the Future of SRE
USENIX via YouTube
Tracing Bare Metal with OpenTelemetry
USENIX via YouTube
Improving How We Observe Our Observability Data - Techniques for SREs
USENIX via YouTube