SRE in the Small and in the Large

Offered By: USENIX via YouTube

Course Description

Overview

Explore a comprehensive conference talk from LISA16 that delves into the principles of Site Reliability Engineering (SRE) and their applicability to organizations of all sizes. Discover how SRE practices, often associated with large-scale systems engineering, can be effectively implemented in both small startups and major corporations like Google. Learn about key SRE concepts, including load testing, application state management, monitoring, dependency management, sharding, and distributed applications. Gain insights into common objections to SRE implementation and understand the trade-offs involved. Examine real-world examples and case studies that illustrate the practical application of SRE principles across various scenarios. Uncover little-known facts about SRE and explore arguments for and against microservices architecture in the context of reliability engineering.

Syllabus

Introduction
Agenda
Introductions
SRE vs Software Engineering
SRE is a new way of engineering
What does SRE do
SRE in large companies
We don't have infinite chocolate
Success in Google
The SRE is doomed
Not all companies are doing SRE
Mini Europe
Story Time
Story Time 2
Pivot to the General
Load Tests
Export Application State
Monitor
Dependencies
Sharding
Distributed Applications
Little Known Fact
Most General Objection
Tradeoff
Exporting Application State
Debugging Without Application State
Bad Monitoring
Kitty and Bear
Dependency Testing
Stack Overflow
Precious Servers
Paxos
Distributed Consensus
Identifiers
Microservices
Arguments against microservices

Taught by

USENIX
