Techniques for SLOs and Error Budgets at Scale
Offered By: Conf42 via YouTube
Course Description
Overview
Explore techniques for implementing Service Level Objectives (SLOs) and Error Budgets at scale in this conference talk from Conf42 Observability 2023. Dive into the challenges of quantifying latency for large-scale systems, learn about the differences between measuring availability and latency, and discover key strategies for democratizing SLOs and error budgets across engineering teams. Gain insights on using raw histograms for accurate latency measurements, decomposing histogram modes, and implementing multi-service SLOs and error budgets. Perfect for engineers and managers looking to improve their observability practices and maintain high-quality service levels in complex, large-scale environments.
Syllabus
intro
preface
have you used this in your career? traffic for total.rrd
hi, i'm fred
how do you implement slos for 1000 engineers?
books
sli: good vs bad requests
slo: good/bad time_range
eb: 1-slo, 1-0.9995 = 0.05%
keys to slo / error budget democratization
latency and availability
measuring availability is easy, measuring latency is not easy
quantifying latency at scale
a common mistake
"dr. histogram - how i learned to stop worrying and love latency bands"
use raw histograms, avoid sketches & approximations
decomposing histogram modes
multi service slos / error budgets
thank you, questions?
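The "sli", "slo", and "eb" syllabus entries above reduce to a little arithmetic. The following Python sketch illustrates it under stated assumptions: the function names and the sample request counts are hypothetical and are not taken from the talk. Only the 0.9995 target and the resulting 0.05% budget come from the syllabus.

```python
def sli(good: int, total: int) -> float:
    """Service level indicator: fraction of requests that were good."""
    # Assumption: with no traffic, treat the indicator as fully met.
    return good / total if total else 1.0


def error_budget(slo_target: float) -> float:
    """Error budget as a fraction of requests: 1 - SLO."""
    return 1.0 - slo_target


def budget_remaining(good: int, total: int, slo_target: float) -> float:
    """Fraction of the error budget still unspent over the measured window."""
    allowed_bad = error_budget(slo_target) * total
    actual_bad = total - good
    # Assumption: a 100% target (zero budget) is reported as fully spent
    # as soon as any request fails.
    if allowed_bad == 0:
        return 1.0 if actual_bad == 0 else 0.0
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    target = 0.9995  # SLO from the syllabus line "1-0.9995 = 0.05%"
    print(f"error budget: {error_budget(target):.2%}")
    # Hypothetical window: 1,000,000 requests, 300 of them bad.
    print(f"sli: {sli(999_700, 1_000_000):.4%}")
    print(f"budget remaining: {budget_remaining(999_700, 1_000_000, target):.0%}")
```

Expressing the budget as a count of allowed bad requests over a time range, rather than as a bare percentage, is one way to make the number comparable across services with very different traffic volumes.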
Taught by
Conf42
Related Courses
Introduction to the Theory of Cybernetic Systems (Введение в теорию кибернетических систем), Saint Petersburg State University via Coursera
Dynamical System and Control, Indian Institute of Technology Roorkee via Swayam
Kyma – A Flexible Way to Connect and Extend Applications, SAP Learning
Linear Systems Theory, Indian Institute of Technology Madras via Swayam
Introduction to DevOps and Site Reliability Engineering, Linux Foundation via edX