How to Take Prometheus Planet Scale - Massively Large Scale Metrics Installations
Offered By: USENIX via YouTube
Course Description
Overview
Explore how eBay scaled its observability infrastructure to handle massive volumes of time series data in this conference talk from SREcon23 Americas. Learn how eBay's metrics installation grew from an ingest rate of 2 million per second in 2017 to 40 million per second, with nearly 3 billion active time series. Discover the challenges of running large centralized clusters, including query latencies that degraded as cardinality grew. Examine how insights from Google's Monarch paper inspired a decentralized, planet-scale approach built on the Prometheus TSDB. Delve into the development of field hint indices for efficient query fanout, functional query push-down techniques, and the use of GitOps to manage globally distributed deployments. Gain practical lessons on scaling Prometheus to planet-scale metrics installations and on the challenges of operating such systems.
Syllabus
SREcon23 Americas - How To Take Prometheus Planet Scale: Massively Large Scale Metrics Installations
Taught by
USENIX
Related Courses
Introduction to Cloud Infrastructure Technologies - Linux Foundation via edX
Scalable Microservices with Kubernetes - Google via Udacity
Google Cloud Fundamentals: Core Infrastructure - Google via Coursera
Introduction to Kubernetes - Linux Foundation via edX
Fundamentals of Containers, Kubernetes, and Red Hat OpenShift - Red Hat via edX