Elixir for Data Engineering - Batch and Stream Processing
Offered By: Databricks via YouTube
Course Description
Overview
Discover how to leverage Elixir for innovative data engineering solutions in this 24-minute conference talk from Databricks. Explore the power of Erlang's lightweight distributed process coordination for running worker clusters across Docker containers and performing data ingestion. Learn about a framework that integrates Elixir functions as steps in Airflow graphs. Dive into techniques for consuming and processing Kafka events directly within Elixir microservices. Examine real system examples with step-by-step walkthroughs of key elements, covering topics such as services architecture, scheduled job frameworks, and case studies on notification analytics. Gain insights into setting up Kafka consumers and implementing GenServer modules, all without requiring prior Erlang or Elixir knowledge.
Syllabus
Intro
Services architecture
• Services configured as Erlang clusters with nodes
• Nodes deployed on containers
• Nodes running the service will spawn Erlang processes
Scheduled job framework written in Elixir
• Handles coordination of jobs across the Erlang nodes in a service, rerunning of failed jobs, and persisting of status logs
Steps of a scheduled job workflow in Airflow
Scheduled jobs in Application start()
Case study: Notification view analytics
Case study: Notification analysis
Setting up KafkaEx
1. Add mix dependency to build
Supervisor module to listen on consur
GenServer consumer
Taught by
Databricks
Related Courses
Concurrent Programming in Erlang (University of Kent via FutureLearn)
Erlang (Exercism)
Functional Programming in Erlang (FutureLearn)
An Erlang Course (Independent)
Erlang master classes (Independent)