End-to-End Data Engineering with Apache Airflow, Docker, and Spark Clusters - Using Python, Scala, and Java
Offered By: CodeWithYu via YouTube
Course Description
Overview
Learn to set up and use Apache Airflow and Spark clusters on Docker in this video tutorial. Build an end-to-end data engineering project that combines Apache Airflow, Docker, and Spark clusters with jobs written in Python, Scala, and Java. Follow along as the instructor writes a basic Spark job in each language, builds and compiles the Scala and Java jobs, submits them to the Spark cluster for processing, and observes the results live. Gain hands-on experience with cluster computation and workflow automation, essential skills for big data analytics and data engineering projects.
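To give a sense of what a "basic job" might look like, here is a minimal word-count sketch in Python. It is not taken from the course; the application name, the sample sentences, and the `tokenize` and `run_word_count` helpers are all illustrative assumptions, and running the Spark part requires the `pyspark` package.

```python
"""Word-count sketch: an illustrative basic PySpark job (not the course's own code)."""
import re


def tokenize(line):
    """Split a line into lowercase words. Pure Python, so it works without Spark."""
    return re.findall(r"[a-z']+", line.lower())


def run_word_count(lines, master=None):
    """Count words with Spark and return a dict of word -> count.

    Requires pyspark. `master` is optional; when the job is launched through
    spark-submit, the master is usually supplied on the command line instead.
    """
    # Imported lazily so that tokenize() can be used and tested without Spark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("WordCountSketch")  # hypothetical app name
    if master:
        builder = builder.master(master)
    spark = builder.getOrCreate()
    try:
        counts = (
            spark.sparkContext.parallelize(lines)
            .flatMap(tokenize)                    # one record per word
            .map(lambda word: (word, 1))          # pair each word with a count of 1
            .reduceByKey(lambda a, b: a + b)      # sum the counts per word
            .collect()
        )
        return dict(counts)
    finally:
        spark.stop()


if __name__ == "__main__":
    # Made-up sample input for illustration only.
    sample = ["Hello Spark", "Hello Airflow", "Hello Docker"]
    for word, count in sorted(run_word_count(sample).items()):
        print(word, count)
```

Keeping the tokenizing logic in a plain function is a design choice for this sketch: it can be checked locally before the job is sent to a cluster.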
Syllabus
Introduction
Creating The Spark Cluster and Airflow on Docker
Creating Spark Job with Python
Creating Spark Job with Scala
Building and Compiling Scala Jobs
Creating Spark Job with Java
Building and Compiling Java Jobs
Cluster computation results
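The syllabus covers jobs in three languages, and one way to picture the submission step is as a `spark-submit` command per job. The sketch below only assembles such commands; it is an assumption about how submission could look, not the course's setup. The master URL `spark://spark-master:7077`, the file paths, and the class names are hypothetical placeholders. `--master` and `--class` are standard `spark-submit` options; compiled Scala and Java jobs are typically submitted as a JAR with a main class, while a Python job is submitted as a script.

```python
"""Assemble spark-submit commands for Python, Scala, and Java jobs (illustrative sketch)."""
import shlex


def build_spark_submit(app_path, master="spark://spark-master:7077",
                       main_class=None, app_args=()):
    """Return a spark-submit command as a list of arguments.

    app_path   -- a .py script or a compiled .jar (hypothetical paths below)
    master     -- Spark master URL; the default here is an assumed placeholder
    main_class -- entry-point class, needed for Scala/Java JARs only
    app_args   -- extra arguments passed through to the job itself
    """
    cmd = ["spark-submit", "--master", master]
    if main_class:
        cmd += ["--class", main_class]
    cmd.append(app_path)
    cmd.extend(app_args)
    return cmd


if __name__ == "__main__":
    # All paths and class names are made up for illustration.
    jobs = [
        build_spark_submit("jobs/python/wordcount.py"),
        build_spark_submit("jobs/scala/wordcount.jar", main_class="WordCountScala"),
        build_spark_submit("jobs/java/wordcount.jar", main_class="WordCountJava"),
    ]
    for cmd in jobs:
        # shlex.join quotes arguments so the printed line is safe to paste into a shell.
        print(shlex.join(cmd))
```

In a workflow tool such as Airflow, a command like this could be run from a task, which is one plausible reading of the "workflow automation" part of the course.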
Taught by
CodeWithYu
Related Courses
Cloud Computing Applications, Part 1: Cloud Systems and Infrastructure - University of Illinois at Urbana-Champaign via Coursera
Introduction to Cloud Infrastructure Technologies - Linux Foundation via edX
Introduction aux conteneurs - Microsoft Virtual Academy via OpenClassrooms
The Docker for DevOps course: From development to production - Udemy
Windows Server 2016: Virtualization - Microsoft via edX