YoVDO

Creating and Submitting PySpark Jobs to Spark Clusters

Offered By: CodeWithYu via YouTube

Tags

Apache Spark Courses Big Data Courses Java Courses Scala Courses Docker Courses Apache Airflow Courses PySpark Courses Data Engineering Courses

Course Description

Overview

Learn to create and submit PySpark jobs to Spark clusters in this comprehensive tutorial. Dive into an end-to-end data engineering project combining Apache Airflow, Docker, Spark Clusters, Scala, Python, and Java. Create basic jobs using multiple programming languages, submit them to the Spark cluster for processing, and observe live results. Follow along as the instructor guides you through setting up a Spark cluster and Airflow on Docker, creating Spark jobs in Python, Scala, and Java, building and compiling Scala and Java jobs, and analyzing cluster computation results. Gain practical experience in big data processing and workflow automation, essential skills for aspiring data engineers.
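The kind of basic Python job described above can be sketched as a classic word count. This is an illustrative sketch, not the course's actual script: the file name, app name, and input path are hypothetical, and the pure-Python helper mirrors the distributed pipeline so the logic can be checked locally without a cluster.

```python
"""Sketch of a minimal PySpark job (e.g. wordcount.py) like those created
in the tutorial. Paths and names are illustrative, not from the course."""
from operator import add


def count_words(lines):
    # Local reference implementation of the same transformation the
    # Spark pipeline below distributes across executors.
    counts = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def main():
    # Imported lazily so this module also loads where PySpark is absent.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("WordCount").getOrCreate()
    sc = spark.sparkContext
    counts = (
        sc.textFile("input.txt")            # hypothetical input path
          .flatMap(lambda line: line.split())
          .map(lambda word: (word, 1))
          .reduceByKey(add)
    )
    for word, n in counts.collect():
        print(word, n)
    spark.stop()


if __name__ == "__main__":
    main()
```

Submitted to a cluster, `reduceByKey` shuffles matching words to the same executor before summing, which is what makes the computation scale beyond one machine.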

Syllabus

Introduction
Creating the Spark Cluster and Airflow on Docker
Creating Spark Job with Python
Creating Spark Job with Scala
Building and Compiling Scala Jobs
Creating Spark Job with Java
Building and Compiling Java Jobs
Cluster Computation Results
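Each syllabus step ends with submitting the built job to the cluster. A minimal sketch of assembling that `spark-submit` invocation programmatically, assuming a standalone master at spark://spark-master:7077 (a hypothetical host name typical of a Docker Compose setup) and an illustrative job path:

```python
"""Sketch: assemble a spark-submit command for a job on a standalone
cluster. Master URL and job path are assumptions, not from the course."""
import subprocess


def build_submit_command(app_path, master="spark://spark-master:7077"):
    # spark-submit is Spark's standard entry point; --master points it
    # at the cluster rather than running the job locally.
    return ["spark-submit", "--master", master, app_path]


if __name__ == "__main__":
    cmd = build_submit_command("jobs/python/wordcount.py")
    subprocess.run(cmd, check=True)  # blocks until the job finishes
```

The same command shape works for the Scala and Java jobs once compiled: pass the packaged JAR as the application path and add `--class` with the main class name.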


Taught by

CodeWithYu

Related Courses

Functional Programming Principles in Scala
École Polytechnique Fédérale de Lausanne via Coursera
Functional Program Design in Scala
École Polytechnique Fédérale de Lausanne via Coursera
Parallel programming
École Polytechnique Fédérale de Lausanne via Coursera
Big Data Analysis with Scala and Spark
École Polytechnique Fédérale de Lausanne via Coursera
Functional Programming in Scala Capstone
École Polytechnique Fédérale de Lausanne via Coursera