Tower of Babel - Making Apache Spark, Kubeflow, and Kubernetes Play Nice
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore a conference talk that delves into the integration of Apache Spark, Kubeflow, and Kubernetes for big data matrix processing. Learn how to overcome the challenges of working with large-scale matrices that exceed the memory capacity of individual Kubernetes nodes. Discover how Apache Spark and Apache Mahout can be leveraged to distribute matrices across multiple pods and nodes, enabling processing of matrices of any dimension. Gain insights into using Kubeflow to enhance reproducibility and streamline workflows. Examine a real-world case study on denoising DICOM images of COVID patients' lungs, showcasing how these technologies can be combined to create a repeatable pipeline. Understand the potential impact of this approach in assisting doctors in resource-limited hospitals and advancing automated COVID detection research.
Syllabus
Tower of Babel: Making Apache Spark, Kubeflow, and Kubernetes Play Nice - Holden Karau, Netflix
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data Engineering - University of California, Berkeley via edX
Big Data Analytics - University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD - Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames - Yandex via Coursera
Introduction to Apache Spark and AWS - University of London International Programmes via Coursera