Ensuring Business Continuity with Embedded SRE Tools and Incident Management
Offered By: Platform Engineering via YouTube
Course Description
Overview
Explore a 15-minute conference talk from PlatformCon 2023 on ensuring business continuity through embedded Site Reliability Engineering (SRE) tools and incident management. Discover how MercadoLibre integrates SRE practices into its platform, including uptime modeling, architectural reviews, reliability checks, and critical application processes. Learn how the platform implements load testing tools, centralized alert management, runbooks, and comprehensive event auditing for effective troubleshooting. See how these embedded elements support a robust reliability engineering approach that reduces cognitive load and simplifies delivery, and why high availability in platform engineering matters to both developers and end users. Join Oscar Mullin, Sr. Director of Platform Core Services and SRE at MercadoLibre, as he discusses promoting a culture of reliability, reducing downtime, and improving developer and end-user satisfaction through integrated SRE practices and tools.
Syllabus
Ensuring business continuity with embedded SRE tools and incident management | PlatformCon 2023
Taught by
Platform Engineering
Related Courses
Cybersecurity and Its Ten Domains: University System of Georgia via Coursera
Introduction to Data Storage and Management Technologies: IEEE via edX
คลังข้อมูล (Data Warehouse): Chiang Mai University via ThaiMOOC
Managing Cybersecurity Incidents and Disasters: University System of Georgia via Coursera
Ciberseguridad en linea (Online Cybersecurity): Udemy