SRE and DevOps at a Startup

Offered By: USENIX via YouTube

Tags

LISA (Large Installation System Administration) Conference Courses
DevOps Courses
Incident Response Courses
Capacity Planning Courses
Team Management Courses

Course Description

Overview

Explore a 26-minute conference talk from LISA18 on implementing Site Reliability Engineering (SRE) and DevOps principles in startup environments. Craig Sebenik from Split compares centralized support teams with distributed teams that include developers. Discover the challenges of applying Google's SRE model to smaller-scale operations, and gain insights into the lessons learned and potential pitfalls of either approach. Understand the differences between pure DevOps and specialist roles, hiring considerations, and the SRE hierarchy of reliability. The talk covers metrics and monitoring, incident response, release management, and capacity planning in the context of startups with limited resources.

Syllabus

Intro
SRE (and DevOps) at a Startup
My Background
What is Split?
Overview
What is SRE?
DevOps is Not a Job Title
Life at a Startup
Lots of Pieces
Developers Have Product Focus
Specialist (aka SRE)
Pure DevOps vs Specialist
Who To Hire
SRE Hierarchy of Reliability
Metrics and Monitoring
Incident Response
Release
Capacity Planning
Summary: SRE is an implementation of the DevOps paradigm
Questions?


Taught by

USENIX

Related Courses

Named Data Networking
USENIX via YouTube
Release Engineering Best Practices at Google
USENIX via YouTube
Efficiently Backing Up Terabytes of Data with PgBackRest
USENIX via YouTube
SRE in the Small and in the Large
USENIX via YouTube
Network-Based LUKS Volume Decryption with Tang
USENIX via YouTube