Scalable Anomaly Detection - With Zero Machine Learning
Offered By: Strange Loop Conference via YouTube
Course Description
Overview
Explore a Strange Loop Conference talk on building a scalable anomaly detection system without machine learning. The talk covers Netflix's approach to detecting and pinpointing failures in a cloud architecture composed of thousands of services and hundreds of thousands of VMs and containers. Learn how Zuul, Netflix's front door for all cloud traffic, streams real-time events that help identify broken paths among the microservices. See how stream processing, anomaly detection algorithms, and a rules engine combine into a system that handles millions of requests across thousands of nodes. The talk explains the benefits of "old-fashioned math" over machine learning in certain scenarios and how dynamic and adaptive thresholds are implemented. It examines the anomaly detection algorithm in depth, including median estimation, median absolute deviation (MAD), and recovery detection, then covers impact assessment, data visualization, and the use of Spinnaker events to identify problems more accurately. The resulting system provides real-time alerting, reduces operational burden, and improves accuracy in detecting service issues within Netflix's microservices architecture.
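To make the median and MAD idea concrete, here is a minimal Python sketch of threshold-based anomaly detection with a simple recovery rule. It is an illustration only, not the algorithm presented in the talk or Netflix's implementation. The function names (`mad_threshold`, `detect`), the window size, the multiplier `k`, the `floor` on the spread, and the "three calm points" recovery rule are all assumptions chosen for the example.

```python
from collections import deque
from statistics import median


def mad_threshold(window, k=3.0, floor=1.0):
    """Upper threshold: median + k * scaled MAD (hypothetical helper)."""
    med = median(window)
    # Median absolute deviation: median distance of each point from the median.
    mad = median(abs(x - med) for x in window)
    # 1.4826 makes MAD comparable to a standard deviation for normal data.
    # The floor (an assumption) avoids a zero-width band on flat data.
    spread = max(1.4826 * mad, floor)
    return med + k * spread


def detect(series, window_size=10, k=3.0, recovery_points=3):
    """Return (index, "anomaly" | "recovered") state transitions."""
    window = deque(maxlen=window_size)  # recent values considered normal
    events = []
    anomalous = False
    calm = 0  # consecutive in-threshold points seen while anomalous
    for i, value in enumerate(series):
        if len(window) < window_size:
            window.append(value)  # warm-up: fill the baseline first
            continue
        limit = mad_threshold(window, k)
        if value > limit:
            calm = 0
            if not anomalous:
                anomalous = True
                events.append((i, "anomaly"))
        else:
            # Only normal points update the baseline, so a spike does not
            # inflate the threshold that is used to detect it.
            window.append(value)
            if anomalous:
                calm += 1
                if calm >= recovery_points:
                    anomalous = False
                    calm = 0
                    events.append((i, "recovered"))
    return events


# Example: per-interval error counts with a burst in the middle.
errors = [5, 6, 5, 7, 6, 5, 6, 7, 5, 6, 6, 40, 45, 50, 6, 5, 6, 7]
print(detect(errors))
```

Because the threshold is recomputed from a sliding window of recent values, it adapts as the baseline drifts, and the median and MAD are far less sensitive to outliers than a mean and standard deviation would be.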
Syllabus
Introduction
Netflix's Microservices
Time Series Database
Alerts Only for Zuul
Dynamic Thresholds
Adaptive Thresholds
Anomaly Detector Raju
Results
No Machine Learning
How We Built It
Impact Graph
Context
Accuracy
Operational Burden
Real-time Alerting
Real-time Events
Mantis
How It Works
Querying
Stream Processing
Aggregate
Job Chain
Requirements
Median estimation
MAD
Raju
Simple
Recovery Detection
Recovery Algorithm
What Raju Looks Like
Permutations
Data Visualization
Impact Assessment
Timeline of Events
What is API
Another Needle in a Haystack
Example
Gold Standard KPI
Spinnaker Events
Emailing the Culprits
Benefits
Conclusion
Taught by
Strange Loop Conference
Related Courses
Apache Kafka Deep Dive, offered by A Cloud Guru
Microsoft Certified: Azure Data Engineer Associate (DP-203), offered by A Cloud Guru
Deep Dive into Concepts and Tools for Analyzing Streaming Data (Italian), offered by Amazon Web Services via AWS Skill Builder
Cloud Computing Concepts: Part 2, offered by University of Illinois at Urbana-Champaign via Coursera
Apache Kafka, offered by LearnKartS via Coursera