Techniques for SLOs and Error Budgets at Scale
Offered By: Conf42 via YouTube
Course Description
Overview
Explore techniques for implementing Service Level Objectives (SLOs) and Error Budgets at scale in this conference talk from Conf42 Observability 2023. Dive into the challenges of quantifying latency for large-scale systems, learn about the differences between measuring availability and latency, and discover key strategies for democratizing SLOs and error budgets across engineering teams. Gain insights on using raw histograms for accurate latency measurements, decomposing histogram modes, and implementing multi-service SLOs and error budgets. Perfect for engineers and managers looking to improve their observability practices and maintain high-quality service levels in complex, large-scale environments.
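One way to read the "raw histograms" advice above: if latency is stored as per-bucket request counts, a latency SLI can be computed by counting requests at or below a threshold that coincides with a bucket boundary, with no percentile approximation. The sketch below is an illustration under that assumption, not code from the talk; the function name, bucket bounds, and counts are hypothetical.

```python
def latency_sli(upper_bounds_ms, counts, threshold_ms):
    """Return the fraction of requests with latency <= threshold_ms.

    upper_bounds_ms: sorted bucket upper bounds in milliseconds.
    counts: counts[i] is the number of requests in bucket i (not cumulative).
    threshold_ms: must equal one of the bucket bounds, so no interpolation
    inside a bucket is needed (an assumption of this sketch).
    """
    if len(upper_bounds_ms) != len(counts):
        raise ValueError("bounds and counts must have the same length")
    if threshold_ms not in upper_bounds_ms:
        raise ValueError("threshold must coincide with a bucket boundary")
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    last_good = upper_bounds_ms.index(threshold_ms)
    good = sum(counts[: last_good + 1])
    return good / total


# Hypothetical histogram: 1000 requests spread over five latency buckets.
bounds = [10, 50, 100, 250, 1000]
counts = [400, 300, 200, 80, 20]
print(latency_sli(bounds, counts, 100))  # 900 of 1000 requests are within 100 ms
```

Because the counts are additive, histograms from several hosts or time windows with the same bucket bounds can be summed element-wise before calling the function.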
Syllabus
intro
preface
have you used this in your career? traffic for total.rrd
hi, i'm fred
how do you implement slos for 1000 engineers?
books
sli: good vs bad requests
slo: good/bad time_range
eb: 1-slo, 1-0.9995 = 0.05%
keys to slo / error budget democratization
latency and availability
measuring availability is easy, measuring latency is not easy
quantifying latency at scale
a common mistake
"dr. histogram - how i learned to stop worrying and love latency bands"
use raw histograms, avoid sketches & approximations
decomposing histogram modes
multi service slos / error budgets
thank you, questions?
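The SLI, SLO, and error budget entries in the syllabus reduce to simple arithmetic, including the 1 - 0.9995 = 0.05% example. A minimal sketch, assuming good and total request counts are available for the SLO time range; the function names and the sample counts are hypothetical and do not come from the talk.

```python
def sli(good: int, total: int) -> float:
    """Fraction of good requests in the window (the SLI)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return good / total


def error_budget(slo: float) -> float:
    """Allowed fraction of bad requests: 1 - SLO."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    return 1.0 - slo


def budget_remaining(good: int, total: int, slo: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_bad = error_budget(slo) * total
    actual_bad = total - good
    return 1.0 - actual_bad / allowed_bad


slo = 0.9995
print(f"error budget: {error_budget(slo):.2%}")  # 0.05%, as in the syllabus

# Hypothetical window: 1,000,000 requests, 200 of them bad.
print(f"SLI: {sli(999_800, 1_000_000):.4%}")
print(f"budget remaining: {budget_remaining(999_800, 1_000_000, slo):.0%}")
```

With a 99.95% SLO, one million requests allow about 500 bad ones, so 200 bad requests leave roughly 60% of the budget unspent.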
Taught by
Conf42
Related Courses
Intro to Descriptive Statistics (San Jose State University via Udacity)
Teaching Statistical Thinking: Part 1 Descriptive Statistics (Duke University via Coursera)
Introductory Statistics : Analyzing Data Using Graphs and Statistics (Seoul National University via edX)
Data Science for University Students (I) (ga109) (Shiga University via gacco)
Exploratory Data Analysis with Seaborn (Coursera Project Network via Coursera)