YoVDO

Spark on Kubernetes - Best Practice and Performance

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Conference Talks Courses
Kubernetes Courses
High Availability Courses

Course Description

Overview

Explore best practices and performance optimization techniques for running Apache Spark on Kubernetes in this 39-minute conference talk by Junjie Chen and Jerry Shao from Tencent. Learn about deploying Spark as a public cloud service using Kubernetes, covering topics such as authorization, logging, and multi-tenancy management. Discover performance tuning strategies for maximizing resource utilization, including detailed configuration adjustments for both Kubernetes and Spark. Gain insights into achieving high availability through ZooKeeper integration and understand the performance impact of various configurations using TPC-DS workload benchmarks. Delve into the architecture, applications, storage services, and environments involved in Spark on Kubernetes deployments, and benefit from the speakers' real-world experiences and practical advice for optimizing big data services on containerized platforms.

Syllabus

Introduction
What is Spark
Why do we need Kubernetes
Architecture
Spark Application
Spark on accumulated status
Applications
Storage
Service
Structure
HDFS
Catalog
Highs
Environments
Benchmark Configuration
Benchmark Results
Data Locality
Our Experience
Summary


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

Optimizing Microsoft Windows Server Storage
Microsoft via edX
High Availability and Disaster Recovery with the SAP HANA Platform
SAP Learning
Microsoft Exchange Server 2016 - 3: Mailbox Databases
Microsoft via edX
Microsoft SharePoint 2016: Workload Optimization
Microsoft via edX
Microsoft Azure Virtual Machines
Microsoft via edX