HDInsight Deep Dive: Storm, HBase, and Hive
Offered By: Pluralsight
Course Description
Overview
HDInsight is Microsoft's managed Big Data stack in the cloud. With Azure you can provision clusters running Storm, HBase, and Hive, which can process thousands of events per second, store petabytes of data, and give you a SQL-like interface to query it all. In this course, we'll build out a full solution using the stack and take a deep dive into each of the technologies.
Storm is a distributed compute platform which you can plug into Azure Event Hubs and use to power event stream processing. You can scale Storm to read tens of thousands of events per second and build a reliable workflow so that every event is guaranteed to be processed. HBase is a NoSQL database which is easy to get started with and can store tables with billions of rows and millions of columns. It is built for real-time data access, and it has a REST interface so you can read and write HBase data from a .NET Storm app. Hive is a data warehouse that provides a SQL-like interface over Big Data sources, including HBase tables. With Hive you can join across multiple sources and run queries from PowerShell and .NET. In this course, we use all three technologies running on Microsoft Azure to build a race timing solution and dive into performance tuning, reliability, and administration.
Syllabus
- Architecting a Solution with HDInsight (12 mins)
- Storing Race Data in HBase (40 mins)
- HBase Deep Dive (43 mins)
- Processing Timing Events with Storm (38 mins)
- Storm Deep Dive (42 mins)
- Querying Race Data with Hive (38 mins)
- Hive Deep Dive (37 mins)
Taught by
Elton Stoneman
Related Courses
- A Day in the Life of a Data Engineer, Amazon Web Services via AWS Skill Builder
- A Day in the Life of a Data Engineer (Indonesian), Amazon Web Services via AWS Skill Builder
- A Day in the Life of a Data Engineer (Japanese), Amazon Web Services via AWS Skill Builder
- A Day in the Life of a Data Engineer (Korean), Amazon Web Services via AWS Skill Builder
- A Day in the Life of a Data Engineer (Simplified Chinese), Amazon Web Services via AWS Skill Builder