YoVDO

Monitoring the World - Scaling Thanos in Dynamic Prometheus Environments

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Prometheus Courses, Distributed Systems Courses, Scalability Courses, Cloud Infrastructure Courses, Infrastructure Management Courses, Data Centers Courses

Course Description

Overview

Explore Cloudflare's journey in scaling Thanos for dynamic Prometheus environments in this 22-minute conference talk. Discover how Cloudflare implemented a single pane of glass to monitor their extensive Prometheus infrastructure across nearly 500 datacenters worldwide. Learn about the challenges faced and solutions developed for managing hundreds of geographically dispersed sidecars and collecting tens of billions of active time series. Gain insights into the tooling created for automatic management and scaling of infrastructure, including the creation and wiring of new buckets and sidecars, automatic sharding of stores as buckets grow, and utilization of spare CPU capacity for running compactors during non-peak hours. Understand the evolution of Cloudflare's monitoring system from a centralized OpenTSDB instance to a scalable Thanos-based solution, providing valuable lessons for organizations dealing with large-scale, globally distributed monitoring environments.

Syllabus

Monitoring the World: Scaling Thanos in Dynamic Prometheus Environments - Colin Douch, Cloudflare


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

Cybersecurity Policy for Water and Electricity Infrastructures
University of Colorado System via Coursera
Continuous Delivery & DevOps
University of Virginia via Coursera
Preparing for your Professional Cloud Architect Journey
Google Cloud via Coursera
Infrastructure Planning and Management
Indian Institute of Technology Madras via Swayam
Public Library Management
University of Michigan via edX