Faster, Safer, Serverless - Empowering Apache Spark Standalone Cluster on Kubernetes
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore an approach to running Apache Spark on Kubernetes in this conference talk. Learn how to overcome the prolonged startup times that slow down quick data analysis with Spark SQL. Discover a Kubernetes-native serverless Spark service that prioritizes speed and simplicity through a new Kubernetes operator for standalone cluster creation and job submission. Understand how this solution uses Kubernetes' elasticity and policy management capabilities, including the Kubernetes Metrics Server, the Horizontal Pod Autoscaler (HPA), and Kyverno, to streamline Apache Spark workflows for infrastructure engineers and users. Gain insights into achieving rapid responsiveness (under 4 seconds) and integrating long-running ML training frameworks. Delve into a future for Apache Spark in which Kubernetes serves as the core, enabling more efficient and responsive data processing and analysis.
Syllabus
Faster, Safer, Serverless - Empowering Apache Spark Standalone Cluster on Kubernetes - Huichao Zhao
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Big Data Analytics (University of Adelaide via edX)
Big Data Essentials: HDFS, MapReduce and Spark RDD (Yandex via Coursera)
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames (Yandex via Coursera)
Introduction to Apache Spark and AWS (University of London International Programmes via Coursera)