Data Warehouse Concepts, Design, and Data Integration
Offered By: University of Colorado System via Coursera
Course Description
Overview
This is the second course in the Data Warehousing for Business Intelligence specialization. Ideally, the courses should be taken in sequence.
In this course, you will learn exciting concepts and skills for designing data warehouses and creating data integration workflows. These are fundamental skills for data warehouse developers and administrators. You will gain hands-on experience with data warehouse design and use open source products for manipulating pivot tables and creating data integration workflows. In the data integration assignment, you can use an Oracle, MySQL, or PostgreSQL database. You will also gain conceptual background about maturity models, architectures, multidimensional models, and management practices, providing an organizational perspective on data warehouse development. If you are currently a business or information technology professional and want to become a data warehouse designer or administrator, this course will give you the knowledge and skills to do that. By the end of the course, you will have the design experience, software background, and organizational context that prepare you to succeed with data warehouse development projects.
In this course, you will create data warehouse designs and data integration workflows that satisfy the business intelligence needs of organizations. When you’re done with this course, you’ll be able to:
* Evaluate an organization for data warehouse maturity and business architecture alignment;
* Create a data warehouse design and reflect on alternative design methodologies and design goals;
* Create data integration workflows using prominent open source software;
* Reflect on the role of change data, refresh constraints, refresh frequency trade-offs, and data quality goals in data integration process design; and
* Perform operations on pivot tables to satisfy typical business analysis requests using prominent open source software.
Syllabus
- Data Warehouse Concepts and Architectures
- Module 1 introduces the course and covers concepts that provide a context for the remainder of this course. In the first two lessons, you’ll understand the objectives for the course and know what topics and assignments to expect. In the remaining lessons, you will learn about historical reasons for development of data warehouse technology, learning effects, business architectures, maturity models, project management issues, market trends, and employment opportunities. This informational module will ensure that you have the background for success in later modules that emphasize details and hands-on skills. You should also read about the software requirements in the lesson at the end of module 1. I recommend that you try to install the software this week before assignments begin in week 2.
- Multidimensional Data Representation and Manipulation
- Now that you have conceptual background for data warehouse development, you’ll start using data warehouse tools. In module 2, you will learn about the multidimensional representation of a data warehouse used by business analysts. You’ll apply what you’ve learned in practice and graded problems using WebPivotTable, a web-based tool for manipulating pivot tables. At the end of this module, you will have a solid background to communicate with and assist business analysts who use a multidimensional representation of a data warehouse. To complete this module, you should proceed to the assignment and quiz involving WebPivotTable.
- Data Warehouse Design Practices and Methodologies
- This module emphasizes data warehouse design skills. Now that you understand the multidimensional representation used by business analysts, you are ready to learn about data warehouse design using a relational database. In practice, the multidimensional representation used by business analysts must be derived from a data warehouse design using a relational DBMS. You will learn about design patterns, summarizability problems, transformations for schema integration, and design methodologies. You will apply these concepts to mini case studies about data warehouse design. At the end of the module, you will have created data warehouse designs based on data sources and business needs of hypothetical organizations.
- Data Integration Concepts, Processes, and Techniques
- Module 4 extends your background about data warehouse development. After learning about schema design concepts and practices, you are ready to learn about data integration processing to populate and refresh a data warehouse. The informational background in module 4 covers concepts about data sources, data integration processes, and techniques for pattern matching and inexact matching of text. Module 4 provides detailed material about SQL statements for data integration with examples and an assignment for both Oracle Cloud and PostgreSQL. Module 4 provides a context for the software skills that you will learn in module 5.
- Architectures, Features, and Details of Data Integration Tools
- Module 5 extends your background about data integration from module 4. Module 5 covers architectures, features, and details about data integration tools to complement the conceptual background in module 4. You will learn about the features of two open source data integration tools, Talend Open Studio and Pentaho Data Integration. You will use Pentaho Data Integration in a guided tutorial in preparation for a graded assignment involving Pentaho Data Integration. For the tutorial and assignment, you need to connect to a database server, either Oracle Cloud or PostgreSQL. If you have time, I recommend completing the data integration assignment using both Oracle Cloud and PostgreSQL.
Taught by
Michael Mannino
Related Courses
* Data Science Basics (A Cloud Guru)
* Introduction to Machine Learning (A Cloud Guru)
* Address Business Issues with Data Science (CertNexus via Coursera)
* Advanced Clinical Data Science (University of Colorado System via Coursera)
* Advanced Data Science Capstone (IBM via Coursera)