YoVDO

Analyze Big Data with Hadoop

Offered By: Amazon Web Services via AWS Skill Builder

Tags

  • Hadoop Courses
  • Big Data Courses
  • Amazon EC2 Courses
  • Amazon S3 Courses
  • Amazon EMR Courses
  • Data Engineering Courses
  • HiveQL Courses

Course Description

Overview

Languages Available: Español (Latinoamérica) | Español (España) | Français | Bahasa Indonesia | Italiano | 日本語 | 한국어 | Português (Brasil) | 中文(简体)

In this lab, you will deploy a fully functional Hadoop cluster in just a few minutes, ready to analyze log data. You will start by launching an Amazon EMR cluster and then use a HiveQL script to process sample log data stored in an Amazon S3 bucket. HiveQL is a SQL-like scripting language for data warehousing and analysis. You can then use a similar setup to analyze your own log files.


Level

Fundamental


Duration

1 hour


Course Objectives

In this course, you will learn how to:

  • Launch a fully functional Hadoop cluster using **Amazon EMR**
  • Define the schema and create a table for sample log data stored in Amazon S3
  • Analyze the data using a **HiveQL** script and write the results back to Amazon S3
  • Download and view the results on your computer
  • Connect to the Hive CLI and run a **HiveQL** query script to view the results
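
The schema, table, and analysis objectives above can be sketched as a small Python helper that assembles a HiveQL script. This is only an illustration and not the lab's actual script: the function name `build_hive_script`, the table name `sample_logs`, its three columns, the tab-delimited format, and the bucket paths are all hypothetical assumptions.

```python
def build_hive_script(input_uri: str, output_uri: str,
                      table: str = "sample_logs") -> str:
    """Return a HiveQL script as text (hypothetical schema and names).

    The script defines an external table over log files in Amazon S3,
    then writes an aggregate of that table back to Amazon S3.
    """
    for uri in (input_uri, output_uri):
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")

    return f"""\
-- Define the schema for log data that already sits in Amazon S3.
-- Column names and the tab delimiter are assumptions for illustration.
CREATE EXTERNAL TABLE IF NOT EXISTS {table} (
  request_time STRING,
  client_ip    STRING,
  os           STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LOCATION '{input_uri}';

-- Analyze the data and write the results back to Amazon S3.
INSERT OVERWRITE DIRECTORY '{output_uri}'
SELECT os, COUNT(*) AS request_count
FROM {table}
GROUP BY os;
"""


if __name__ == "__main__":
    # Placeholder bucket name; substitute your own bucket.
    print(build_hive_script("s3://example-bucket/input/",
                            "s3://example-bucket/output/"))
```

The generated text could be saved to a `.q` file and uploaded to the bucket or pasted into the Hive CLI; the real lab supplies its own script and sample data, which may differ from this sketch.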


Intended Audience

This course is intended for:

  • Data Engineers

Prerequisites

We recommend that attendees of this course have the following prerequisites:

  • IT Experience: Prior experience with Hadoop is recommended, but not required, to complete this lab
  • AWS Experience: Basic familiarity with Amazon S3 and Amazon EC2 key pairs is suggested, but not required, to complete this lab


Course Outline

  • Task 1: Create an Amazon S3 bucket
  • Task 2: Launch an Amazon EMR cluster
  • Task 3: Process your sample data by running a Hive script
  • Task 4: View the Results
  • Task 5: Connect to the EMR cluster CLI and run a query using HiveQL
  • Task 6: Terminate your Amazon EMR cluster

Related Courses

  • Getting Started with Amazon Simple Storage Service (S3), Amazon via Independent
  • Deep Dive into Amazon Simple Storage Service (Amazon S3), Amazon via Independent
  • AWS Developer Series, Amazon via edX
  • Crear y gestionar archivos con AWS S3, Coursera Project Network via Coursera
  • Building Data Lakes on AWS, Amazon Web Services via Coursera