YoVDO

Simplifying Testing of Spark Applications

Offered By: Linux Foundation via YouTube

Tags

Apache Spark Courses
Python Courses
Sentiment Analysis Courses
Databricks Courses
pandas Courses
PySpark Courses
User-Defined Functions Courses
Kaggle Courses

Course Description

Overview

In this conference talk, Megan Yow from Sobeys and Han Wang from Lyft present techniques for simplifying the testing of Spark applications. The talk covers the reasons for using Spark, compares PySpark with pandas, and walks through scenarios involving Python UDFs, Pandas UDFs, and pandas type hints. It then covers undecorated UDFs, the Functional API, and working with Spark DataFrames, along with the use of Databricks and Kaggle notebooks for Spark development. The testing portion addresses exploration goals and mindset, and a closing demo shows a notebook environment, tokenization, and sentiment scoring. The presentation runs 46 minutes and is aimed at developers who want to improve how they test Spark applications.

Syllabus

Intro
Why use Spark
Python Spark
pandas
Python vs Spark
Scenarios
Python UDF
Pandas UDF
Pandas Type Hints
Undecorated UDFs
Functional API
Spark DataFrame
Databricks
Kaggle Notebook
Testing in Spark
Exploration Goals
Mindset
Demo
Notebook Environment
Tokenization
Sentiment Scores


Taught by

Linux Foundation

Related Courses

Computational Investing, Part I
Georgia Institute of Technology via Coursera
Introduction to Machine Learning
Higher School of Economics via Coursera
Mathematics and Python for Data Analysis
Moscow Institute of Physics and Technology via Coursera
Introduction to Python for Data Science
Microsoft via edX
Python for Data Science
University of California, San Diego via edX