YoVDO

How Tracing Uncovers Half-truths in Slack's CI Infrastructure

Offered By: Strange Loop Conference via YouTube

Tags

Strange Loop Conference Courses

Course Description

Overview

Explore how tracing uncovers half-truths in Slack's CI infrastructure in this 23-minute conference talk from Strange Loop. Discover why traditional monitoring tools such as logs and metrics were insufficient for debugging CI system failures. Learn how traces provided critical capabilities for understanding where faults occur across interconnected systems such as GHE, Checkpoint, and Cypress. Gain insights into shared tooling for high-dimensionality event traces using SlackTrace and SpanEvents, and how that tooling sped up diagnosing code problems and debugging complex system interactions. Follow the journey from the early incidents that motivated investment in internal tooling to improvements in performance and resiliency across Slack's infrastructure. Topics include developer productivity, span event structure, shared dimensions, use cases, fuzzy service boundaries, incident command systems, and testing changes.

Syllabus

Intro
Developer Productivity
Span Event Structure
What's Next
Shared Dimensions
Use Cases
The Root Challenge
The Results
Fuzzy Service Boundaries
Incident Command System
Testing Changes
Summary


Taught by

Strange Loop Conference

Related Courses

Sniffing the Metaverse
Strange Loop Conference via YouTube
KalDB - A Cloud Native Log Search Platform
Strange Loop Conference via YouTube
The Evolution of a Planetary-scale Distributed Database
Strange Loop Conference via YouTube
Machine Learning for Developer Productivity
Strange Loop Conference via YouTube
Formally Verifying Everybody's Cryptography
Strange Loop Conference via YouTube