Large Scale Batch Processing with Argo Workflows and Events
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore large-scale batch processing with Argo Workflows and Argo Events in this 29-minute conference talk by Rakesh Subramanian Suresh and Saravanan Balasubramanian of Intuit. Discover how Intuit runs approximately 40,000 data processing pipelines on Argo Workflows, 10% of which run concurrently each day. Learn how an event-driven approach built on Argo Events triggers pipelines from calendar events, upstream pipeline statuses, and data updates. Gain insights into scaling Argo Events to handle around 400,000 distinct events for pipeline workflows while achieving exactly-once triggering. Delve into the architecture of the Batch Processing Platform, multi-cluster scaling, high availability strategies, and pod disruption budgets. Understand how data processors and pipelines are created, scheduled, and managed, including their data dependencies. Get a glimpse of what the team plans to work on next.
Syllabus
Intro
Overview
Intuit Data Processing Flow
Batch Processing Platform Architecture
Intuit Batch Processing Flow
Scaling Argo Events
Scaling Argo Workflow in Multi Cluster
High Availability
Pod Disruption Budget (PDB)
Rate Limit for Concurrent Pods
Creating Data Processor
Creating Data Pipeline
Scheduling & Managing Data Pipeline
A Cluster View of Batch Dependencies
What we are working on Next
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Introduction to Windows PowerShell (Microsoft via edX)
Windows PowerShell Basics (Microsoft via edX)
Preparing for Google Cloud Certification: Cloud Data Engineer (Google Cloud via Coursera)
Data Engineering on Google Cloud Platform en Français (Google Cloud via Coursera)
Data Engineering on Google Cloud Platform en Español (Google Cloud via Coursera)