Lab - Analyze and Prepare Data with Amazon SageMaker Data Wrangler and Amazon EMR

Offered By: Amazon Web Services via AWS Skill Builder

Course Description

Overview

In this lab, you learn how to visualize, prepare, and transform a dataset in SageMaker Data Wrangler. You also use S3 and SageMaker Studio to interact with Apache Hive using Apache Spark.

Objectives

Understand effective methods for visualizing data
Explore methods for data cleaning and transformation, including how to handle missing values, outliers, and duplicated data
Learn how to ingest and transform data in Amazon SageMaker Data Wrangler
Experiment with how to transform data using Spark on Amazon EMR
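
The cleaning steps named in the objectives (duplicated data, missing values, outliers) can be illustrated with a minimal plain-Python sketch. This is not how Data Wrangler or Spark implements them. The function name `clean_rows`, the median fill, and the 1.5 x IQR clipping rule are assumptions chosen only for illustration, and the sample records are made up.

```python
from statistics import median, quantiles


def clean_rows(rows, column):
    """Hypothetical helper: drop duplicate rows, fill missing values in
    `column` with the median, then clip outliers to the 1.5 * IQR fences."""
    # 1. Remove exact duplicate rows while preserving the original order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is not mutated

    # 2. Fill missing values (None) with the median of the observed values.
    observed = [r[column] for r in unique if r[column] is not None]
    fill_value = median(observed)
    for r in unique:
        if r[column] is None:
            r[column] = fill_value

    # 3. Clip outliers to the interquartile-range fences (assumed rule).
    q1, _, q3 = quantiles([r[column] for r in unique], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    for r in unique:
        r[column] = min(max(r[column], low), high)
    return unique


# Made-up sample data: one duplicate row, one missing value, one outlier.
sample = [
    {"id": 1, "age": 30}, {"id": 1, "age": 30}, {"id": 2, "age": 31},
    {"id": 3, "age": 32}, {"id": 4, "age": 33}, {"id": 5, "age": None},
    {"id": 6, "age": 34}, {"id": 7, "age": 35}, {"id": 8, "age": 36},
    {"id": 9, "age": 37}, {"id": 10, "age": 500},
]
cleaned = clean_rows(sample, "age")
```

In the lab itself, the equivalent steps are performed with Data Wrangler transforms and Spark rather than hand-written code; the sketch only shows what each step does to the data.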

Prerequisites

Basic navigation of the AWS Management Console.
An understanding of database concepts, MySQL, and database availability.

Outline

Task 1: Import, visualize, and perform a preliminary analysis of the data with SageMaker Data Wrangler
Task 2: Analyze and visualize the data
Task 3: Perform data transformations and export the datasets
Task 4: Set up the environment
Task 5: Connect to an EMR cluster
Task 6: Explore and query data from the SparkMagic PySpark kernel
