Simplifying Testing of Spark Applications
Offered By: Linux Foundation via YouTube
Course Description
Overview
In this conference talk, Megan Yow from Sobeys and Han Wang from Lyft present techniques for simplifying the testing of Spark applications. The talk covers the advantages of using Spark, compares Python Spark with pandas, and walks through scenarios involving Python UDFs, Pandas UDFs, and Pandas type hints. It also covers undecorated UDFs, the Functional API, and working with Spark DataFrames, along with the use of Databricks and Kaggle Notebooks for Spark development. A section on testing in Spark discusses exploration goals and mindset, followed by a demonstration in a notebook environment that includes tokenization and sentiment scores. The presentation runs 46 minutes and is aimed at developers who want to improve how they test Spark applications.
Syllabus
Intro
Why use Spark
Python Spark
pandas
Python vs Spark
Scenarios
Python UDF
Pandas UDF
Pandas Type Hints
Undecorated UDFs
Functional API
Spark DataFrame
Databricks
Kaggle Notebook
Testing in Spark
Exploration Goals
Mindset
Demo
Notebook Environment
Tokenization
Sentiment Scores
Taught by
Linux Foundation
Related Courses
Computational Investing, Part I (Georgia Institute of Technology via Coursera)
Introduction to Machine Learning (Higher School of Economics via Coursera)
Mathematics and Python for Data Analysis (Moscow Institute of Physics and Technology via Coursera)
Introduction to Python for Data Science (Microsoft via edX)
Python for Data Science (University of California, San Diego via edX)