Efficient Scheduling of High Performance Batch Computing for Analytics Workloads with Volcano
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore how the ING Wholesale Banking Advanced Analytics team implemented efficient scheduling of high-performance batch computing for analytics workloads using Volcano. Discover the journey of creating a centralized platform for internal data sources and large-scale computing that gives over 300 internal projects and 2,000 users access to advanced analytics capabilities. Learn how the team adopted Volcano, a specialized cloud-native Kubernetes scheduler, to optimize resource usage and maintain the stability of core services. Gain insights into the custom extension developed for the Apache Spark binaries, which enables dynamic allocation and hierarchical dominant resource fairness (HDRF) in multi-tenant environments. Understand how this solution lets users run Spark in interactive mode from Jupyter notebooks with Volcano and visualize scheduling metrics in a view similar to the YARN UI.
Syllabus
Efficient Scheduling Of High Performance Batch Computing For... Krzysztof Adamski & Tinco Boekestijn
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Introduction to Data Science in Python
University of Michigan via Coursera
Julia Scientific Programming
University of Cape Town via Coursera
Python for Data Science
University of California, San Diego via edX
Probability and Statistics in Data Science using Python
University of California, San Diego via edX
Introduction to Python: Fundamentals
Microsoft via edX