YoVDO

Spark at Scale: Engineering Strategies for Data Science Workflows

Offered By: Data Science Festival via YouTube

Tags

Apache Spark Courses
PySpark Courses
Big Data Analytics Courses

Course Description

Overview

This 40-minute conference talk from the Data Science Festival Summer School 2023 explores engineering strategies for optimizing data science workflows with Spark. Neil McCulloch, Data Science Engineer at dunnhumby, presents a case study on improving the performance of problematic PySpark applications, showing how runtimes for in-store availability reporting science were cut in half. The talk covers the challenges of large-scale data processing and practical approaches to making PySpark applications and big data analytics workflows more efficient.

Syllabus

Spark at Scale: Engineering Strategies for Data Science Workflows


Taught by

Data Science Festival

Related Courses

CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Big Data Analytics
University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD
Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera