Introduction to Big Data Analytics

Offered By: University of California, San Diego via Coursera

Tags

Big Data Courses Data Science Courses

Course Description

Overview

*********
A new, improved version of the Big Data Specialization will become available on June 6! As such, enrollment for this course and all courses in this original Big Data Specialization will close on June 6.

The original Big Data Specialization will continue to run until September 2016, when the Capstone will be offered for learners in this version of the Specialization.

If you are in the middle of the Specialization and purchased the entire original Big Data Specialization before June 6, Coursera will contact you with the option of staying in the original Specialization or switching to the new version.

If you are just getting started on this Specialization, we recommend that you wait until June 6 to enroll in the new version.

*********


This course is for novice programmers or business people who would like to understand the more advanced tools used to wrangle and analyze big data. In this course you will be guided through basic approaches to querying and exploring data using higher-level tools built on top of a Hadoop platform. You will be walked through the query interfaces, environments, and canonical use cases for tools like HBase, Hive, and Pig, as well as more general tools like Spark SQL. After this course you will be able to identify the kinds of analysis you can perform on big data and know how to interpret the results.

Syllabus

HBASE: Hadoop's database
In this module we will talk about HBase, Hadoop’s database: a distributed, scalable big data store.

HIVE: a Hadoop-based data warehouse
In this module we will learn about Hive, the data warehousing infrastructure built on Hadoop that facilitates querying and managing large datasets residing in distributed storage.

PIG: A Dataflow Engine for Hadoop
Wouldn’t it be nice if you could quickly gather, sample, and view HDFS data and perform simple analysis on it? Then Pig might fly for you! This module will introduce the essential concepts and basic execution of Pig.

Splunk: Log Analysis and More
Welcome to the Data Analytics with Splunk module. This week, we go further in our quest to examine different ways and tools to deploy analytics on large datasets and gain valuable insights.

Spark for Analytics
Welcome to Spark for Analytics. This week we will learn about Spark SQL, which provides a higher-level interface for processing your data and writing more expressive code. We'll focus on data exploration, cleaning, and plotting.


Taught by

Paul Rodriguez, Andrea Zonca and Natasha Balac

Related Courses

Data Analysis
Johns Hopkins University via Coursera
Computing for Data Analysis
Johns Hopkins University via Coursera
Scientific Computing
University of Washington via Coursera
Introduction to Data Science
University of Washington via Coursera
Web Intelligence and Big Data
Indian Institute of Technology Delhi via Coursera