YoVDO

Simplifying Testing of Spark Applications

Offered By: Linux Foundation via YouTube

Tags

Apache Spark Courses, Python Courses, Sentiment Analysis Courses, Databricks Courses, pandas Courses, PySpark Courses, User-Defined Functions Courses, Kaggle Courses

Course Description

Overview

Explore techniques for simplifying testing of Spark applications in this conference talk presented by Megan Yow from Sobeys and Han Wang from Lyft. Dive into the advantages of using Spark, compare Python Spark with pandas, and understand various scenarios involving Python UDFs, Pandas UDFs, and Pandas Type Hints. Learn about undecorated UDFs, the Functional API, and working with Spark DataFrames. Gain insights into using Databricks and Kaggle Notebooks for Spark development. Discover best practices for testing in Spark, including exploration goals and mindset. Watch a demonstration showcasing a notebook environment, tokenization techniques, and sentiment score analysis. This 46-minute presentation provides valuable knowledge for developers looking to enhance their Spark application testing skills.
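
To make the idea of undecorated UDFs concrete, here is a minimal sketch, not taken from the talk itself, of how tokenization and sentiment scoring logic might be kept in plain Python functions so they can be unit-tested without a Spark session. The function names and the tiny word lists are hypothetical and chosen only for illustration.

```python
import re
from typing import List

# Hypothetical word lists, used only for this illustration.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text: str) -> int:
    """Count positive tokens minus negative tokens."""
    tokens = tokenize(text)
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


if __name__ == "__main__":
    # Plain assertions: no Spark cluster or session is needed to run these.
    assert tokenize("I love Spark!") == ["i", "love", "spark"]
    assert sentiment_score("Great talk, bad audio, good demo") == 1
    assert sentiment_score("") == 0
```

Because the functions carry no Spark decorator, they run under any Python test runner. Under the same assumption, they could later be wrapped for Spark, for instance with `pyspark.sql.functions.udf`, without changing the tested logic.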

Syllabus

Intro
Why use Spark
Python Spark
pandas
Python vs Spark
Scenarios
Python UDF
Pandas UDF
Pandas Type Hints
Undecorated UDFs
Functional API
Spark DataFrame
Databricks
Kaggle Notebook
Testing in Spark
Exploration Goals
Mindset
Demo
Notebook Environment
Tokenization
Sentiment Scores


Taught by

Linux Foundation

Related Courses

Text Mining and Analytics
University of Illinois at Urbana-Champaign via Coursera
Introduction to Natural Language Processing
University of Michigan via Coursera
Enabling Technologies for Data Science and Analytics: The Internet of Things
Columbia University via edX
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
moocTLH: Nuevos retos en las tecnologías del lenguaje humano
Universidad de Alicante via Miríadax