Spark on Kubernetes - Best Practice and Performance

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Conference Talks Courses
Kubernetes Courses
High Availability Courses

Course Description

Overview

Explore best practices and performance optimization techniques for running Apache Spark on Kubernetes in this 39-minute conference talk by Junjie Chen and Jerry Shao from Tencent. Learn about deploying Spark as a public cloud service on Kubernetes, covering topics such as authorization, logging, and multi-tenancy management. Discover performance tuning strategies for maximizing resource utilization, including detailed configuration adjustments for both Kubernetes and Spark. Gain insights into achieving high availability through ZooKeeper integration, and see how various configurations affect performance in TPC-DS workload benchmarks. Delve into the architecture, applications, storage services, and environments involved in Spark on Kubernetes deployments, and benefit from the speakers' real-world experience and practical advice for optimizing big data services on containerized platforms.

Syllabus

Introduction
What is Spark
Why do we need Kubernetes
Architecture
Spark Application
Spark on accumulated status
Applications
Storage
Service
Structure
HDFS
Catalog
Highs
Environments
Benchmark Configuration
Benchmark Results
Data Locality
Our Experience
Summary


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

Building Geospatial Apps on Postgres, PostGIS, & Citus at Large Scale
Microsoft via YouTube
Unlocking the Power of ML for Your JavaScript Applications with TensorFlow.js
TensorFlow via YouTube
Managing the Reactive World with RxJava - Jake Wharton
ChariotSolutions via YouTube
What's New in Grails 2.0
ChariotSolutions via YouTube
Performance Analysis of Apache Spark and Presto in Cloud Environments
Databricks via YouTube