YoVDO

Efficient Scheduling of High Performance Batch Computing for Analytics Workloads with Volcano

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Kubernetes Courses
Apache Spark Courses
Jupyter Notebooks Courses
Data Analytics Courses
Volcano Courses

Course Description

Overview

Explore how the ING Wholesale Banking Advanced Analytics team implemented efficient scheduling of high-performance batch computing for analytics workloads using Volcano. Discover the journey of creating a centralized platform for internal data sources and large-scale computing, which gives over 300 internal projects and 2,000 users access to advanced analytics capabilities. Learn how the team adopted Volcano, a specialized cloud-native Kubernetes scheduler, to optimize resource usage and maintain the stability of core services. Gain insights into the custom extension developed for Apache Spark binaries, which allows dynamic allocation and hierarchical dominant resource fairness (HDRF) in multi-tenant environments. Understand how this solution lets users run Spark interactive mode with Volcano from Jupyter notebooks and visualize scheduling metrics in a view similar to the YARN UI.

Syllabus

Efficient Scheduling Of High Performance Batch Computing For... Krzysztof Adamski & Tinco Boekestijn


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Big Data Analytics
University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD
Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera