
ETL Pipeline to Achieve Reliability at Scale

Offered By: EuroPython Conference via YouTube

Tags

EuroPython Courses, Fault Tolerance Courses

Course Description

Overview

Explore the design and implementation of a scalable ETL pipeline that handles high-volume financial transactions for an online betting exchange. Learn how the pipeline produces reliable, accurate daily accounting reports while meeting three requirements: fault tolerance, fast data retrieval, and efficient computation. Discover the motivations behind the chosen tech stack (Python 3, Luigi, and Spark) and see how it addresses the key technical problems: identifying and rerunning faulty steps, optimizing input/output operations, and speeding up computation. Delve into Spark's key concepts and execution model, how Spark jobs are launched from Luigi, and how running Spark on Amazon EMR improves performance and scalability.
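The idea of "identifying and rerunning faulty steps" can be illustrated with a small sketch. The code below is not taken from the talk and does not use Luigi itself. It is a plain-Python approximation of the output-based completion check that Luigi-style schedulers rely on: a step counts as done once its output file exists, so a rerun after a failure skips finished steps and repeats only the ones that did not complete. The names `Step`, `build`, and `run_demo`, and the simulated failure, are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class Step:
    """A pipeline step that counts as complete once its output file exists."""

    def __init__(self, name, func, workdir, requires=()):
        self.name = name
        self.func = func
        self.requires = tuple(requires)
        self.output = Path(workdir) / f"{name}.json"

    def complete(self):
        return self.output.exists()

    def run(self):
        # Load the outputs of upstream steps and pass them to this step's function.
        inputs = [json.loads(dep.output.read_text()) for dep in self.requires]
        result = self.func(*inputs)
        # Write to a temporary file, then rename, so a crash mid-write
        # never leaves a partial output that looks complete.
        tmp = self.output.with_suffix(".tmp")
        tmp.write_text(json.dumps(result))
        tmp.replace(self.output)


def build(step, executed):
    """Run `step` after its dependencies, skipping anything already complete."""
    if step.complete():
        return
    for dep in step.requires:
        build(dep, executed)
    step.run()
    executed.append(step.name)


def run_demo():
    workdir = tempfile.mkdtemp()
    attempts = {"count": 0}

    def extract():
        return [10, 20, 30]  # stand-in for a day's transactions

    def flaky_sum(rows):
        attempts["count"] += 1
        if attempts["count"] == 1:
            raise RuntimeError("simulated failure on the first attempt")
        return sum(rows)

    extract_step = Step("extract", extract, workdir)
    report_step = Step("report", flaky_sum, workdir, requires=[extract_step])

    first, second = [], []
    try:
        build(report_step, first)   # extract succeeds, report fails
    except RuntimeError:
        pass
    build(report_step, second)      # only report is rerun
    return first, second, json.loads(report_step.output.read_text())
```

Calling `run_demo()` runs the two-step pipeline twice. On the second pass the extract step is skipped because its output already exists, and only the failed report step is executed again.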

Syllabus

Intro
Accounting at markets
Fault tolerance and reliability
Efficient storage
Good performance
Spark key concepts
Execution on Spark
Spark job from Luigi
Spark on EMR
Shutdown EMR cluster


Taught by

EuroPython Conference

Related Courses

MongoDB for DBAs
MongoDB University
MongoDB Advanced Deployment and Operations
MongoDB University
Building Cloud Apps with Microsoft Azure - Part 3
Microsoft via edX
Implementing Microsoft Windows Server Disks and Volumes
Microsoft via edX
Cloud Computing and Distributed Systems
Indian Institute of Technology Patna via Swayam