YoVDO

Apache Spark vs Databricks - On-Premise Performance Comparison

Offered By: Databricks via YouTube

Tags

Apache Spark Courses
Data Science Courses
Databricks Courses
High Performance Computing Courses
Service-Oriented Architecture Courses
Air-Gapped Environments Courses

Course Description

Overview

Explore an on-premise comparison of Databricks and open-source Apache Spark in this 21-minute conference talk by Booz Allen's Cyber AI team. Discover how the team achieved 10x performance gains on real-world cyber workloads by running the Databricks Runtime (DBR) in an air-gapped environment. Learn about the challenges of doing data science in sensitive operations and the solutions the team adopted: a service-oriented architecture for capability deployment and a project architecture focused on high-performance computing. Gain insights from the results of open-source Spark versus Spark on DBR, along with lessons learned for future on-premise installations in data-sensitive environments.

Syllabus

Intro
The Challenge: Go Fast... On-Premise?
Solution: A Service-Oriented Architecture for Capability Deployment
Project Architecture: Focused on High Performance Computing
Results: Spark Open Source vs Spark DBR
Lessons Learned for Future On-Premise Installs


Taught by

Databricks

Related Courses

Data Analysis (Johns Hopkins University via Coursera)
Computing for Data Analysis (Johns Hopkins University via Coursera)
Scientific Computing (University of Washington via Coursera)
Introduction to Data Science (University of Washington via Coursera)
Web Intelligence and Big Data (Indian Institute of Technology Delhi via Coursera)