Spark

Offered By: Udacity

Tags

  • Big Data Courses
  • Machine Learning Courses
  • DataFrames Courses
  • Code Optimization Courses

Course Description

Overview

In this course, you’ll learn how to use Spark to work with big data and build machine learning models at scale, including how to wrangle and model massive datasets with PySpark, the Python library for interacting with Spark. In lesson one, you will learn about big data and how Spark fits into the big data ecosystem. In lesson two, you will practice processing and cleaning datasets to get comfortable with Spark’s SQL and DataFrame APIs. In lesson three, you will debug and optimize your Spark code when running on a cluster. In lesson four, you will use Spark’s Machine Learning Library to train machine learning models at scale.


Syllabus

  • The Power of Spark
    • Understand the big data ecosystem
    • Understand when to use Spark and when not to use it
  • Data Wrangling with Spark
    • Manipulate data with Spark SQL and Spark DataFrames
    • Use Spark for wrangling massive datasets
  • Debugging and Optimization
    • Troubleshoot common errors and optimize your code using the Spark Web UI
  • Machine Learning with Spark
    • Use Spark’s Machine Learning Library to train machine learning models at scale

Taught by

David Drummond and Judit Lantos

Related Courses

Analisis Data dengan Pemrograman R
Google via Coursera
Analíticas de Datos con Pandas
Tecnológico de Monterrey via Coursera
Spark Overview for Scala Analytics
Cognitive Class
Apache Spark with Scala – Hands-On with Big Data!
Packt via Coursera
Foundations of Data Analysis with Pandas and Python
Packt via Coursera