Why We Built Our Own Distributed Column Store

Offered By: Strange Loop Conference via YouTube

Tags

Strange Loop Conference Courses, Distributed Systems Courses, Software Engineering Courses, Fault Tolerance Courses, Data Ingestion Courses, Database Architecture Courses

Course Description

Overview

Explore the architecture and implementation of Retriever, a custom-built distributed column store database, in this 43-minute Strange Loop Conference talk. Learn how Honeycomb addressed the challenges of understanding complex distributed systems in production by developing a low-latency, schemaless database inspired by Facebook's Scuba. Discover the design decisions behind Retriever, including its use of disk storage, its efficient column-oriented storage model, and its ability to handle multi-tenancy and cost constraints. Gain insights into the write and read paths, data model, storage format, distributed queries, and fault tolerance mechanisms. Understand how Retriever ingests events from Kafka, manages quotas, and handles failure recovery. Delve into the lessons learned from operating a hand-rolled database at production scale with paying customers, and see how it compares to other solutions for running complex queries over large data volumes in real time with sub-second latency.

Syllabus

Intro
Please meet Retriever
Retriever is a special purpose data store
What is Honeycomb?
How Honeycomb works
Honeycomb under the hood
Our requirements
Requirements - summary
Retriever at a glance
Retriever compared to Scuba
Architecture - write path
Architecture - read path
Data model - datasets
Data model - events
Row oriented storage
Column oriented storage
Storage Format - timestamp column
Storage Format - reading
Distributed queries
Distributed reads - calculations
Distributed reads - fanout
Detour - Kafka
Ingestion
Quota management
Fault tolerance
Failure recovery
Bootstrapping new nodes


Taught by

Strange Loop Conference
