Spark at Scale: Engineering Strategies for Data Science Workflows
Offered By: Data Science Festival via YouTube
Course Description
Overview
Explore engineering strategies for optimizing data science workflows with Spark in this 40-minute conference talk from the Data Science Festival Summer School 2023. Neil McCulloch, Data Science Engineer at dunnhumby, presents a case study on improving the performance of problematic PySpark applications, showing how runtimes for in-store availability reporting science were cut in half. The talk covers practical approaches to large-scale data processing and to making Spark-based data science projects more efficient.
Syllabus
Spark at Scale: Engineering Strategies for Data Science Workflows
Taught by
Data Science Festival
Related Courses
Fundamentals of Scalable Data Science (IBM via Coursera)
Data Science and Engineering with Spark (Berkeley University of California via edX)
Master of Machine Learning and Data Science (Imperial College London via Coursera)
Data Analysis Using Pyspark (Coursera Project Network via Coursera)
Building Machine Learning Pipelines in PySpark MLlib (Coursera Project Network via Coursera)