Simplifying Testing of Spark Applications

Offered By: Linux Foundation via YouTube

Tags

Apache Spark Courses, Python Courses, Sentiment Analysis Courses, Databricks Courses, pandas Courses, PySpark Courses, User-Defined Functions Courses, Kaggle Courses

Course Description

Overview

This conference talk, presented by Megan Yow from Sobeys and Han Wang from Lyft, explores techniques for simplifying the testing of Spark applications. It covers the advantages of using Spark, compares Python Spark with pandas, and walks through scenarios involving Python UDFs, Pandas UDFs, and Pandas Type Hints. It also covers undecorated UDFs, the Functional API, and working with Spark DataFrames, along with using Databricks and Kaggle Notebooks for Spark development. The section on testing in Spark addresses exploration goals and mindset, and the closing demo shows a notebook environment, tokenization, and sentiment score analysis. The presentation runs 46 minutes.
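
To make the idea of undecorated UDFs concrete, here is a minimal sketch, not taken from the talk itself. It assumes the general pattern of keeping tokenization and sentiment scoring as plain Python functions so they can be unit tested without a Spark session. The word lists and the function names `tokenize` and `sentiment_score` are hypothetical and chosen only for illustration.

```python
import re
from typing import List

# Hypothetical toy lexicon, for illustration only (not from the talk).
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text: str) -> int:
    """Count positive tokens minus negative tokens."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


# Plain assertions act as unit tests; no Spark cluster is needed here.
assert tokenize("I love Spark!") == ["i", "love", "spark"]
assert sentiment_score("Great talk, bad audio, love it") == 1
```

Because these functions carry no Spark decorator, they run under any Python test runner. Registering one as a Spark UDF, for example with `pyspark.sql.functions.udf`, can then be a separate, thin step at the edge of the application.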

Syllabus

Intro
Why use Spark
Python Spark
pandas
Python vs Spark
Scenarios
Python UDF
Pandas UDF
Pandas Type Hints
Undecorated UDFs
Functional API
Spark DataFrame
Databricks
Kaggle Notebook
Testing in Spark
Exploration Goals
Mindset
Demo
Notebook Environment
Tokenization
Sentiment Scores


Taught by

Linux Foundation
