Exploring Google Ngrams with Amazon EMR and Hive
Offered By: Amazon Web Services via AWS Skill Builder
Course Description
Overview
Languages Available: Español (Latinoamérica) | Español (España) | Français | Bahasa Indonesia | Italiano | 日本語 | 한국어 | Português (Brasil) | 中文(简体)
This lab demonstrates how to launch an Amazon Elastic MapReduce (EMR) cluster for big data processing and use Hive with SQL-style queries to analyze data. You will create a Hadoop cluster using Amazon EMR, which lets you run interactive Hive queries against data stored in Amazon S3. You will use Hive to normalize the data into a more useful form, and then run queries to analyze it.
Level
Advanced
Duration
1 Hour 15 Minutes
Course Objectives
In this course, you will learn how to:
- Create an Amazon EMR cluster running Hive
- Use Hive statements to create tables from Google Ngram input data stored in Amazon S3
- Run Hive queries to drill down into and analyze data
Intended Audience
This course is intended for:
- Architects
- Data Engineers
Prerequisites
We recommend that attendees of this course have the following prerequisites:
- None
Course Outline
- Task 1: Launch an Amazon EMR cluster
- Task 2: Connect to your cluster
- Task 3: Analyze Data
Related Courses
- Getting Started with Amazon Simple Storage Service (S3), Amazon via Independent
- Deep Dive into Amazon Simple Storage Service (Amazon S3), Amazon via Independent
- AWS Developer Series, Amazon via edX
- Crear y gestionar archivos con AWS S3, Coursera Project Network via Coursera
- Building Data Lakes on AWS, Amazon Web Services via Coursera