Getting Started with Hive for Relational Database Developers
Offered By: Pluralsight
Course Description
Overview
Traditional databases focus on transactional processing, whereas Hive is built for analytical processing over huge datasets. This course focuses on the similarities and differences between SQL and Hive.
Transactional processing focuses on accessing and updating individual records. Analytical processing works on data in bulk and deals more with summaries across the dataset, trends, and insights. The differences in requirements and in the kind of data each works on lead to differences between Hive and traditional databases. This course, Getting Started with Hive for Relational Database Developers, teaches you about several gotchas involved in using familiar SQL constructs in Hive. You'll learn about loading and parsing data from files, views, subqueries, and some cool built-in functionality such as table generating functions. The course also demonstrates the constraints imposed by Hive architecture choices such as schema on read, denormalized storage in HDFS, and high latency of operations. These constraints serve as a guide for your choices about storage and querying. By the end of this course, you'll feel confident using Hive for your own relational database use cases.
Syllabus
- Course Overview (2 mins)
- Hive vs. RDBMS (34 mins)
- Getting Started with Basic Queries in Hive (29 mins)
- Creating Databases and Tables (39 mins)
- Using Complex Data Types and Table Generating Functions (37 mins)
- Understanding Constraints in Subqueries and Views (21 mins)
- Designing Schema for Hive (11 mins)
Taught by
Janani Ravi
Related Courses
- Introduction to Databases (Meta via Coursera)
- Web Development (Udacity)
- Introduction to Data Science (University of Washington via Coursera)
- Datenmanagement mit SQL (openHPI)
- Sabermetrics 101: Introduction to Baseball Analytics (Boston University via edX)