Apache PySpark by Example

Offered By: LinkedIn Learning

Tags

PySpark Courses, Python Courses, Apache Spark Courses

Course Description

Overview

Get up and running with Apache Spark quickly. This practical, hands-on course shows Python users how to use PySpark to apply Spark to data science work.

Syllabus

Introduction
  • Apache PySpark
  • What you should know
1. Introduction to Apache Spark
  • The Apache Spark ecosystem
  • Why Spark?
  • Spark origins and Databricks
  • Spark components
  • Partitions, transformations, lazy evaluations, and actions
2. Technical Setup
  • Set up the lab environment
  • Download a dataset
  • Importing
3. Working with the DataFrame API
  • The DataFrame API
  • Working with DataFrames
  • Schemas
  • Working with columns
  • Working with rows
  • Challenge
  • Solution
4. Functions
  • Built-in functions
  • Working with dates
  • User-defined functions
  • Working with joins
  • Challenge
  • Solution
5. Resilient Distributed Datasets (RDDs)
  • RDDs
  • Working with RDDs
Conclusion
  • Next steps

Taught by

Jonathan Fernandes
