Spark at Scale: Engineering Strategies for Data Science Workflows
Offered By: Data Science Festival via YouTube
Course Description
Overview
Explore engineering strategies for optimizing Spark-based data science workflows in this 40-minute conference talk from the Data Science Festival Summer School 2023. Neil McCulloch, Data Science Engineer at dunnhumby, presents a case study on improving the performance of problematic PySpark applications. Learn how to cut runtimes in half for in-store availability reporting science, and gain insight into tackling large-scale data processing challenges. Discover practical approaches to optimizing PySpark applications and streamlining big data analytics workflows.
Syllabus
Spark at Scale: Engineering Strategies for Data Science Workflows
Taught by
Data Science Festival
Related Courses
Big Data Analytics in Healthcare (Georgia Institute of Technology via Udacity)
Mining Massive Datasets (Stanford University via edX)
The Caltech-JPL Summer School on Big Data Analytics (California Institute of Technology via Coursera)
Big Data Analytics for Healthcare (Georgia Institute of Technology via Coursera)
Data Lakes for Big Data (EdCast)