Machine Learning with PySpark: Data Analysis using SQL
Offered By: Coursera Project Network via Coursera
Course Description
Overview
This Guided Project is for beginning Python developers. In this one-hour, project-based course, you will learn how to describe PySpark and machine learning, use PySpark to capture data, use PySpark SQL to observe the data, use PySpark MLlib to prepare training data, and use PySpark MLlib to predict an outcome. To achieve this, we will use PySpark to read data into a PySpark DataFrame, view the data using PySpark SQL, prepare the test and training data using a heart disease data set, and attempt to predict heart disease from independent variables.
Syllabus
- Project Overview
Taught by
David Dalsveen
Related Courses
- Introduction to Databases (Meta via Coursera)
- Web Development (Udacity)
- Introduction to Data Science (University of Washington via Coursera)
- Datenmanagement mit SQL (openHPI)
- Sabermetrics 101: Introduction to Baseball Analytics (Boston University via edX)