Open-Sourcing Venice
Offered By: Strange Loop Conference via YouTube
Course Description
Overview
Explore LinkedIn's derived data storage system, Venice, in this conference talk from Strange Loop 2022. Discover how Venice provides high-throughput ingestion of data from batch and stream processing jobs while offering low-latency online serving. Learn about its production usage, in which it hosts roughly 1,500 datasets that are rewritten daily and used for AI model inference workloads. Understand Venice's role in the "People You May Know" feature, which performs online deep learning with millions of reads and computations per second. Examine how client applications can use Venice's data plane and APIs either to eagerly load data or to query it over the network. Delve into Venice's architecture, designed for massive scale and operability, with support for self-healing, linear scalability, multi-tenancy, and multi-datacenter replication. Gain insights from Felix GV, Principal Staff Engineer at LinkedIn, as he shares his experience developing Venice from its inception to its current state as a crucial component of LinkedIn's data infrastructure.
Syllabus
Intro
Derived Data Store
Hybrid Workloads
Stream Processing
Streaming Writes
Partial Updates
Correctness
Scale
Single Get Use Case
Read Compute Use Case
Eager Cache
Da Vinci Use Cases
Scalability
What is Venice for?
Taught by
Strange Loop Conference
Related Courses
Sniffing the Metaverse
Strange Loop Conference via YouTube
KalDB - A Cloud Native Log Search Platform
Strange Loop Conference via YouTube
The Evolution of a Planetary-scale Distributed Database
Strange Loop Conference via YouTube
Machine Learning for Developer Productivity
Strange Loop Conference via YouTube
Formally Verifying Everybody's Cryptography
Strange Loop Conference via YouTube