YoVDO

A Field Guide to Reliability Engineering at Zalando - From Small to Large Scale

Offered By: GOTO Conferences via YouTube

Tags

Reliability Engineering Courses
DevOps Courses
Risk Management Courses
Incident Management Courses
Observability Courses
Service-Level Objectives Courses

Course Description

Overview

Explore Zalando's approach to reliability engineering in this comprehensive conference talk from GOTO Amsterdam 2024. Dive into best practices for achieving high reliability at scale, from stand-alone applications to company-wide systems. Learn about instrumentation, monitoring, alerting, tracing, and incident management techniques. Discover how Zalando manages reliability across 3000+ applications and 2000+ engineers to serve 50M+ customers in 23 countries. Gain insights on effective technologies and processes like WORM Cascades and Risk Management for steering reliability at the enterprise level. Follow along as Heinrich Hartmann, Head of Reliability Engineering at Zalando, shares valuable lessons on balancing technological and human factors in building robust, scalable systems.

Syllabus

Intro
Agenda
Principles
Context
Operations at Zalando
Alerting
Dashboards
Observability
SLOs
Incident process
WORMs
Summary
Outro


Taught by

GOTO Conferences

Related Courses

Introduction to the Theory of Cybernetic Systems
Saint Petersburg State University via Coursera
Dynamical System and Control
Indian Institute of Technology Roorkee via Swayam
Kyma – A Flexible Way to Connect and Extend Applications
SAP Learning
Linear Systems Theory
Indian Institute of Technology Madras via Swayam
Introduction to DevOps and Site Reliability Engineering
Linux Foundation via edX