Apache PySpark by Example

Offered By: LinkedIn Learning

Tags

PySpark Courses, Python Courses, Apache Spark Courses

Course Description

Overview

Get up and running with Apache Spark quickly. This practical, hands-on course shows Python users how to work with PySpark, the Python API for Apache Spark, and use Spark for data science.

Syllabus

Introduction
  • Apache PySpark
  • What you should know
1. Introduction to Apache Spark
  • The Apache Spark ecosystem
  • Why Spark?
  • Spark origins and Databricks
  • Spark components
  • Partitions, transformations, lazy evaluations, and actions
2. Technical Setup
  • Set up the lab environment
  • Download a dataset
  • Importing
3. Working with the DataFrame API
  • The DataFrame API
  • Working with DataFrames
  • Schemas
  • Working with columns
  • Working with rows
  • Challenge
  • Solution
4. Functions
  • Built-in functions
  • Working with dates
  • User-defined functions
  • Working with joins
  • Challenge
  • Solution
5. Resilient Distributed Datasets (RDDs)
  • RDDs
  • Working with RDDs
Conclusion
  • Next steps

Taught by

Jonathan Fernandes

Related Courses

  • Artificial Intelligence for Robotics (Stanford University via Udacity)
  • Intro to Computer Science (University of Virginia via Udacity)
  • Design of Computer Programs (Stanford University via Udacity)
  • Web Development (Udacity)
  • Programming Languages (University of Virginia via Udacity)