Getting Started with HDFS
Offered By: Pluralsight
Course Description
Overview
Learning to work with Hadoop Distributed File System (HDFS) is a baseline skill for anyone administering or developing in the Hadoop ecosystem. In this course, you will learn how to work with HDFS, Hive, Pig, Sqoop and HBase from the command line.
Getting Started with Hadoop Distributed File System (HDFS) is designed to give you everything you need to learn about how to use HDFS to read, store, and remove files. In addition to working with files in Hadoop, you will learn how to take data from relational databases and import it into HDFS using Sqoop. After we have our data inside HDFS, we will learn how to use Pig and Hive to query that data. Building on our HDFS skills, we will look at how to use HBase for near real-time data processing. Whether you are a developer, administrator, or data analyst, the concepts in this course are essential to getting started with HDFS.
Syllabus
- Understanding HDFS (19 mins)
- Creating, Manipulating, and Retrieving HDFS Files (47 mins)
- Transferring Relational Data to HDFS Using Sqoop (22 mins)
- Querying Data with Pig and Hive (36 mins)
- Processing Sparse Data with HBase (24 mins)
- Automating Basic HDFS Operations (18 mins)
Taught by
Thomas Henson
Related Courses
- Intro to Hadoop and MapReduce (Cloudera via Udacity)
- Processing Big Data with Hadoop in Azure HDInsight (Microsoft via edX)
- Implementing Real-Time Analytics with Hadoop in Azure HDInsight (Microsoft via edX)
- Hadoop Platform and Application Framework (University of California, San Diego via Coursera)
- Data Manipulation at Scale: Systems and Algorithms (University of Washington via Coursera)