Fault Tolerance Courses
Simons Institute via YouTube
Quantum Neural Networks: Design and Training for Quantum Learning Tasks (Simons Institute via YouTube)
Node.js Microservices: Resilience and Fault Tolerance (Pluralsight)
Characterization of Large Language Model Development in Datacenters (USENIX via YouTube)
Accelerating Skewed Workloads with Performance Multipliers in TurboDB Distributed Database (USENIX via YouTube)
Alea-BFT: Practical Asynchronous Byzantine Fault Tolerance (USENIX via YouTube)
SwiftPaxos - Fast Geo-Replicated State Machines (USENIX via YouTube)
MegaScale - Scaling Large Language Model Training to More Than 10,000 GPUs (USENIX via YouTube)
LoLKV - Logless, Linearizable, RDMA-based Key-Value Storage System (USENIX via YouTube)
The Bedrock of Byzantine Fault Tolerance: A Unified Platform for BFT Protocols - NSDI '24 (USENIX via YouTube)