Getting Started with Apache Spark on Kubernetes
Offered By: Databricks via YouTube
Course Description
Overview
Discover how to run Apache Spark on Kubernetes in this 26-minute video from Databricks. Learn to build, deploy, and maintain end-to-end data pipelines on cloud-agnostic technology that offers improved isolation and resource sharing. Explore environment setup, application sizing, performance optimization, and monitoring through code-heavy demonstrations and live examples on the Data Mechanics platform. Aimed at beginner and intermediate Spark developers, the talk covers data access, node pools, pod sizes, dynamic allocation, disk and I/O optimizations, and application logs and metrics for debugging and reporting.
Syllabus
Introduction
Overview
Autopilot mode
Fully containerized
Architecture
Motivations
Monitoring
Cluster Setup
Demo
What's Next
Taught by
Databricks
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Big Data Analytics (University of Adelaide via edX)
Big Data Essentials: HDFS, MapReduce and Spark RDD (Yandex via Coursera)
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames (Yandex via Coursera)
Introduction to Apache Spark and AWS (University of London International Programmes via Coursera)