Building a Data Platform with Apache Spark on Kubernetes

Offered By: WeAreDevelopers via YouTube

Tags

WeAreDevelopers World Congress Courses, Kubernetes Courses, Apache Spark Courses, Jupyter Notebooks Courses, Data Analytics Courses, Containerization Courses

Course Description

Overview

Explore the challenges and solutions of building a data platform using Apache Spark on Kubernetes in this 31-minute conference talk. Learn how PUBG Corporation, which serves millions of online gamers, migrated its on-demand data analytics platform to Spark on Kubernetes. Discover the Sphynx project, which manages on-demand Spark clusters and Jupyter Notebooks as containerized applications on Kubernetes. Gain insights into the main log pipeline, Apache Spark, the layer platform, the batch system, and the data system domain. Understand Kubernetes deployment, scheduling, and the platform architecture. Delve into workflows, best practices, challenges, monitoring strategies, and future work. Walk away with key takeaways for implementing Spark on Kubernetes in large-scale data processing environments.

Syllabus

Introduction
Overview
Main Log Pipeline
Apache Spark
Layer Platform
Notebooks
Batch System
Spark Platform
Data System Domain
Problems
What is Kubernetes
Kubernetes Deployment
Kubernetes Scheduler
Platform Architecture
Workflow
Best Practices
Challenges
Monitoring
Future Work
Key Takeaways
Questions
Taught by

WeAreDevelopers

Related Courses

Fundamentals of Containers, Kubernetes, and Red Hat OpenShift
Red Hat via edX
Configuration Management for Containerized Delivery
Microsoft via edX
Getting Started with Google Kubernetes Engine - Español
Google Cloud via Coursera
Getting Started with Google Kubernetes Engine - 日本語版
Google Cloud via Coursera
Architecting with Google Kubernetes Engine: Foundations en Español
Google Cloud via Coursera