Exploring Google Ngrams with Amazon EMR and Hive
Offered By: Amazon Web Services via AWS Skill Builder
Course Description
Overview
Languages Available: Español (Latinoamérica) | Español (España) | Français | Bahasa Indonesia | Italiano | 日本語 | 한국어 | Português (Brasil) | 中文(简体)
This lab demonstrates how to launch an Amazon Elastic MapReduce (EMR) cluster for big data processing and how to use Hive with SQL-style queries to analyze data. You will create a Hadoop cluster using Amazon EMR, which lets you run interactive Hive queries against data stored in Amazon S3. You will use Hive to normalize the data into a more useful form, and then run queries to analyze it.
Level
Advanced
Duration
1 Hour 15 Minutes
Course Objectives
In this course, you will learn how to:
- Create an Amazon EMR cluster running Hive
- Use Hive statements to create tables from Google Ngram input data stored in Amazon S3
- Run Hive queries to drill-down and analyze data
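The objectives above can be illustrated with a minimal HiveQL sketch. It is not the lab's actual script: the S3 location, the table names, the five-column layout, the SequenceFile storage format, and the 1890 year cutoff are all assumptions chosen for illustration, and the real lab may differ on any of them.

```sql
-- Hypothetical external table over tab-delimited 1-gram records in S3.
-- The bucket path and column layout are assumptions, not taken from the lab.
CREATE EXTERNAL TABLE IF NOT EXISTS ngrams_raw (
  gram        STRING,
  pub_year    INT,
  occurrences BIGINT,
  pages       BIGINT,
  books       BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS SEQUENCEFILE
LOCATION 's3://example-bucket/ngrams/eng-all/1gram/';

-- Normalize: lowercase each gram, keep purely alphabetic grams,
-- and merge the counts of variants that differ only in case.
CREATE TABLE ngrams_normalized AS
SELECT
  lower(gram)      AS gram,
  pub_year,
  sum(occurrences) AS occurrences
FROM ngrams_raw
WHERE pub_year >= 1890
  AND gram RLIKE '^[A-Za-z]+$'
GROUP BY lower(gram), pub_year;

-- Drill down: the ten most frequent grams in a chosen range of years.
SELECT gram, sum(occurrences) AS total
FROM ngrams_normalized
WHERE pub_year BETWEEN 2000 AND 2008
GROUP BY gram
ORDER BY total DESC
LIMIT 10;
```

Declaring the raw table as EXTERNAL leaves the source files in S3 untouched: dropping the table removes only the Hive metadata, while the normalized table is materialized separately by the cluster.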
Intended Audience
This course is intended for:
- Architects
- Data Engineers
Prerequisites
We recommend that attendees of this course have the following prerequisites:
- None
Course Outline
- Task 1: Launch an Amazon EMR cluster
- Task 2: Connect to Your Cluster
- Task 3: Analyze Data
Related Courses
- Web Intelligence and Big Data (Indian Institute of Technology Delhi via Coursera)
- Big Data for Better Performance (Open2Study)
- Big Data and Education (Columbia University via edX)
- Big Data Analytics in Healthcare (Georgia Institute of Technology via Udacity)
- Data Mining with Weka (University of Waikato via Independent)