Getting Started with Apache Spark on Kubernetes

Offered By: Databricks via YouTube

Tags

Apache Spark
Cloud Computing
Kubernetes
Data Processing
Cluster Management
Containerization
Data Pipelines

Course Description

Overview

This 26-minute video from Databricks introduces running Apache Spark on Kubernetes. It shows how to build, deploy, and maintain end-to-end data pipelines on cloud-agnostic technology that improves isolation and resource sharing. Code-heavy demonstrations and live examples on the Data Mechanics platform cover environment setup, application sizing, performance optimization, and monitoring. Aimed at beginner and intermediate Spark developers, the session covers data access, node pools, pod sizes, dynamic allocation, disk and I/O optimizations, and the application logs and metrics used for debugging and reporting.

Syllabus

Introduction
Overview
Autopilot mode
Fully containerized
Architecture
Motivations
Monitoring
Cluster Setup
Demo
What's Next


Taught by

Databricks

Related Courses

CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Big Data Analytics (University of Adelaide via edX)
Big Data Essentials: HDFS, MapReduce and Spark RDD (Yandex via Coursera)
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames (Yandex via Coursera)
Introduction to Apache Spark and AWS (University of London International Programmes via Coursera)