
SRE and ML - Why It Matters

Offered By: USENIX via YouTube

Tags

SREcon Courses, Machine Learning Courses, Distributed Computing Courses

Course Description

Overview

Explore the intersection of Site Reliability Engineering (SRE) and Machine Learning (ML) in this 44-minute conference talk from SREcon22 EMEA. The talk covers why ML matters for SREs, what makes ML systems hard to run reliably, and how the SRE profession needs to adapt. It takes a critical look at the current state of ML automation in production, separating hype from reality, and discusses managing ML in production, how hard ML is to implement, ML Ops, model quality, and the data-sensitive nature of ML. It closes with predictions for the future and recommended further reading.

Syllabus

Introduction
Lambda
Dave
TMU
ML does matter
Pause and breathe
Managing ML in production
How hard is ML
Hype and Reality
Gartner Hype Cycle
ML vs AI
ML Ops
Model Quality
ML is data sensitive
I can ML
The future
Future predictions
Future reading


Taught by

USENIX

Related Courses

How to Not Destroy Your Production Kubernetes Clusters
USENIX via YouTube
Knowledge and Power - A Sociotechnical Systems Discussion on the Future of SRE
USENIX via YouTube
Tracing Bare Metal with OpenTelemetry
USENIX via YouTube
Improving How We Observe Our Observability Data - Techniques for SREs
USENIX via YouTube
DO, RE, Me - Measuring the Effectiveness of Site Reliability Engineering
USENIX via YouTube