Characterizing and Understanding Phases of SRE Practices
Offered By: USENIX via YouTube
Course Description
Overview
Explore the evolution of Site Reliability Engineering (SRE) practices in this 33-minute conference talk from SREcon18 Asia/Australia. Delve into the journey of SRE implementation, examining various stages of skill acquisition and key practices. Learn about important signposts in areas such as incident prevention and handling, postmortems, KPI/SLOs, monitoring, and capacity management. Gain insights to evaluate your organization's current position on the SRE spectrum and plan for future advancements. Understand the concept of SRE as an ongoing journey, with practical examples and detailed examinations of exemplar values and practices. Use this knowledge to assess and improve your team's approach to site reliability, regardless of your background or your company's current level of SRE implementation.
Syllabus
STAGES OF PRACTICE
Shu Signposts: Incident Response
Ha-Ri Signposts: Incident Response
Shu Signposts: Postmortems
Ha-Ri Signposts: Postmortems
Shu Signposts: Incident Prevention
Ha-Ri Signposts: Incident Prevention
Shu Signposts: Monitoring
Ha-Ri Signposts: Monitoring
Shu Signposts: Performance Management
Ha-Ri Signposts: Capacity Planning and Forecasting
Assessing Your Organization's Level of Practice
Each '9' will cost you more than the one before it...
Taught by
USENIX
Related Courses
How to Not Destroy Your Production Kubernetes Clusters
USENIX via YouTube
SRE and ML - Why It Matters
USENIX via YouTube
Knowledge and Power - A Sociotechnical Systems Discussion on the Future of SRE
USENIX via YouTube
Tracing Bare Metal with OpenTelemetry
USENIX via YouTube
Improving How We Observe Our Observability Data - Techniques for SREs
USENIX via YouTube