
Diagnosing Application-Network Anomalies in Production Clouds

Offered By: USENIX via YouTube

Tags

Cloud Computing Courses
Microservices Courses
Alibaba Cloud Courses

Course Description

Overview

This conference talk from USENIX ATC '24 covers diagnosing application-network anomalies in large-scale production cloud environments. It outlines three obstacles to timely detection and diagnosis: the absence of impact assessment at the (micro)service level, inefficient anomaly routing, and the computational overhead of collecting fine-grained metrics. The talk then presents the Application-Network Diagnosing (AND) system implemented in Alibaba Cloud, which uses TCP retransmission metrics to capture anomalies and correlates applications with networks end to end. AND's core designs are a collection tool for filtering and computing statistics, a real-time detection procedure, and an anomaly routing model. According to the talk, the system has been deployed for over three years and enables minute-level anomaly detection and routing, as well as fast failure recovery.

Syllabus

USENIX ATC '24 - Diagnosing Application-network Anomalies for Millions of IPs in Production Clouds


Taught by

USENIX

Related Courses

Designing Applications for Kubernetes
A Cloud Guru
Docker - Deep Dive
A Cloud Guru
Amazon API Gateway for Serverless Applications
Amazon Web Services via AWS Skill Builder
Amazon API Gateway for Serverless Applications (Simplified Chinese)(中文配音版)
Amazon Web Services via AWS Skill Builder
Amazon API Gateway for Serverless Applications (Traditional Chinese)
Amazon Web Services via AWS Skill Builder