Data Warehouse Engineering
Offered By: IBM via edX
Course Description
Overview
Data Warehouse Engineers and Business Analysts are in high demand as organizations become increasingly dependent on data to support their operations.
Data warehousing has transformed the way organizations perform business analysis and make strategic decisions. Massive amounts of data from multiple sources can be easily accessed using SQL and formatted for analysis, reporting and Business Intelligence for organizations to gain deeper business insights.
The Data Warehouse Engineer Professional Certificate provides you with the skills and knowledge to design, deploy and manage Enterprise Data Warehouses (EDW) and utilize Business Intelligence tools to analyze and extract insights using reports and dashboards.
Upon completing this program, you’ll have gained practical experience working with Relational Database Management Systems (RDBMS), querying data using SQL statements, using Linux/UNIX shell scripts to automate repetitive tasks, and building data pipelines using Apache Airflow and Kafka to Extract, Transform and Load (ETL) data. You’ll also acquire the skills to build and operationalize Data Warehouses and conduct data analysis.
Within each course you’ll practice your skills with numerous hands-on labs and multiple projects to add to your portfolio for launching your career.
To get started, all you need is basic computer literacy and the desire to learn and practice new skills.
Syllabus
Course 1: Data Engineering Basics for Everyone
Learn about data engineering concepts, ecosystem, and lifecycle. Also learn about the systems, processes, and tools you need as a Data Engineer to gather, transform, load, process, query, and manage data so that it can be leveraged by data consumers for operations and decision-making.
Course 2: Relational Database Basics
This course teaches you the fundamental concepts of relational databases and Relational Database Management Systems (RDBMS) such as MySQL, PostgreSQL, and IBM Db2.
Course 3: Introduction to SQL
Learn how to use and apply the powerful language of SQL to better communicate with and extract data from databases - a must for anyone working in Data Engineering, Data Analytics or Data Science.
Course 4: SQL Concepts for Data Engineers
In this short course you will learn additional SQL concepts such as views, stored procedures, transactions and joins.
Course 5: Linux Commands & Shell Scripting
This mini-course describes shell commands and how to use the advanced features of the Bash shell to automate complicated database tasks. For those not familiar with shell scripting, this course provides an overview of common Linux Shell Commands and shell scripting basics.
Course 6: Relational Database Administration (DBA)
This course helps you develop the foundational skills required to perform the role of a Database Administrator (DBA) including designing, implementing, securing, maintaining, troubleshooting and automating databases such as MySQL, PostgreSQL and Db2.
Course 7: Building ETL and Data Pipelines with Bash, Airflow and Kafka
This course provides you with practical skills to build and manage data pipelines and Extract, Transform, Load (ETL) processes using shell scripts, Airflow and Kafka.
Course 8: Data Warehousing and BI Analytics
This course introduces you to designing, implementing and populating a data warehouse and analyzing its data using SQL & Business Intelligence (BI) tools.
Courses
-
This mini-course provides a practical introduction to commonly used Linux / UNIX shell commands and teaches you basics of Bash shell scripting to automate a variety of tasks. The course includes both video-based lectures as well as hands-on labs to practice and apply what you learn. You will have no-charge access to a virtual Linux server that you can access through your web browser, so you don't need to download and install anything to perform the labs.
In this course you will work with general purpose commands, like id, date, uname, ps, top, echo, man; directory management commands such as pwd, cd, mkdir, rmdir, find, df; file management commands like cat, wget, more, head, tail, cp, mv, touch, tar, zip, unzip; the access control command chmod; text processing commands - wc, grep, tr; as well as networking commands - hostname, ping, ifconfig and curl. You will create simple to intermediate shell scripts that involve Metacharacters, Quoting, Variables, Command substitution, I/O Redirection, Pipes & Filters, and Command line arguments. You will also schedule cron jobs using crontab.
This course provides essential hands-on skills for data engineers, data scientists, software developers, and cloud practitioners who want to get familiar with frequently used commands on Linux, macOS and other Unix-like operating systems as well as get started with creating shell scripts.
-
Well-designed and automated data pipelines and ETL processes are the foundation of a successful Business Intelligence platform. Defining your data workflows, pipelines and processes early in the platform design ensures the right raw data is collected, transformed and loaded into desired storage layers and available for processing and analysis as and when required.
This course is designed to provide you with the critical knowledge and skills needed by Data Engineers and Data Warehousing specialists to create and manage ETL, ELT, and data pipeline processes.
Upon completing this course you’ll have a solid understanding of Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) processes; practice extracting data, transforming data, and loading transformed data into a staging area; create an ETL data pipeline using Bash shell scripting; build a batch ETL workflow using Apache Airflow; and build a streaming data pipeline using Apache Kafka.
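The extract, transform, and load steps described above can be illustrated with a minimal sketch. This is not taken from the course materials: it uses Python's standard csv and sqlite3 modules rather than Bash, Airflow, or Kafka, and the sample data, the staging_weather table, and all function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read a file, an API, or a stream.
RAW_CSV = """id,city,temp_f
1,Austin,95
2,Boston,68
3,Chicago,
"""


def extract(text):
    """Extract: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no reading, upper-case the city,
    and convert Fahrenheit to Celsius."""
    cleaned = []
    for row in rows:
        if not row["temp_f"]:
            continue  # skip incomplete records
        temp_c = round((float(row["temp_f"]) - 32) * 5 / 9, 1)
        cleaned.append((int(row["id"]), row["city"].upper(), temp_c))
    return cleaned


def load(rows, conn):
    """Load: write the transformed rows into a staging table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_weather "
        "(id INTEGER PRIMARY KEY, city TEXT, temp_c REAL)"
    )
    conn.executemany("INSERT INTO staging_weather VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline():
    """Run extract -> transform -> load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(RAW_CSV)), conn)
    result = conn.execute(
        "SELECT id, city, temp_c FROM staging_weather ORDER BY id"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage as its own function mirrors how a workflow tool would schedule them as separate tasks, although the orchestration itself is outside the scope of this sketch.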
You’ll gain hands-on experience with practice labs throughout the course and work on a real-world inspired project to build data pipelines using several technologies that can be added to your portfolio and demonstrate your ability to perform as a Data Engineer.
This course requires prior experience working with datasets, SQL, relational databases, and Bash shell scripts.
-
Today’s businesses are investing significantly in capabilities to harness the massive amounts of data that fuel Business Intelligence (BI). Working knowledge of Data Warehouses and BI Analytics tools is a crucial skill for Data Engineers, Data Warehousing Specialists and BI Analysts, who are among the most valued resources for organizations.
This course prepares you with the skills and hands-on experience to design, implement and maintain enterprise data warehouse systems and business intelligence tools. You’ll gain extensive knowledge of various data repositories including data marts, data lakes and data reservoirs, explore data warehousing system architectures, and deepen your understanding of data cubes and data organization using related tables. You’ll also analyze data using business intelligence tools like Cognos Analytics, including its reporting and dashboard features and interactive visualization capabilities.
This course provides hands-on experience with practice labs and a real-world inspired project that can be added to your portfolio and will demonstrate your proficiency in working with data warehouses. Skills you will gain include building data warehouses, Star/Snowflake schemas, CUBEs, ROLLUPs, Materialized Views/MQTs, reports and dashboards.
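As a rough illustration of two of the skills listed above, the sketch below builds a tiny star schema (one fact table, one dimension table) and produces per-region totals plus a grand total, which is the kind of result a ROLLUP query returns. It is an assumption-laden example, not course material: it uses Python's sqlite3 module, SQLite has no ROLLUP operator, so the grand-total row is emulated with UNION ALL, and the dim_store and fact_sales tables and their data are hypothetical.

```python
import sqlite3


def build_star_schema(conn):
    """Create a hypothetical one-dimension star schema and load sample rows."""
    conn.executescript(
        """
        CREATE TABLE dim_store (
            store_id INTEGER PRIMARY KEY,
            region   TEXT NOT NULL
        );
        CREATE TABLE fact_sales (
            sale_id  INTEGER PRIMARY KEY,
            store_id INTEGER NOT NULL REFERENCES dim_store(store_id),
            amount   REAL NOT NULL
        );
        INSERT INTO dim_store VALUES (1, 'East'), (2, 'East'), (3, 'West');
        INSERT INTO fact_sales (store_id, amount)
        VALUES (1, 100.0), (2, 50.0), (3, 70.0), (3, 30.0);
        """
    )


def sales_with_rollup(conn):
    """Return per-region totals followed by a grand-total row (region is NULL).

    Emulates GROUP BY ROLLUP(region) with UNION ALL, since SQLite
    does not provide ROLLUP.
    """
    query = """
        SELECT region, total FROM (
            SELECT d.region AS region, SUM(f.amount) AS total
            FROM fact_sales AS f
            JOIN dim_store AS d ON d.store_id = f.store_id
            GROUP BY d.region
            UNION ALL
            SELECT NULL, SUM(amount) FROM fact_sales
        )
        ORDER BY region IS NULL, region
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    for region, total in sales_with_rollup(connection):
        print(region or "ALL REGIONS", total)
    connection.close()
```

On a database that supports it, the UNION ALL emulation would typically be replaced by a single GROUP BY with ROLLUP; the syntax for that varies by product.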
This course assumes prior SQL and relational database experience.
-
Managing databases is a critical skill for Data Engineers and Database Administrators to ensure data is reliable, protected and easily accessible for organizations to make better decisions, solve problems and create business value.
With the amount of data continually expanding and business leaders focused on building data-literate organizations, it’s no surprise that Database Administrators are in high demand and earn a median salary of US $98,860 per year according to the US Bureau of Labor Statistics.
This course provides you with the knowledge and hands-on experience to manage and maintain databases, understand database security, design and define database schemas, tables, views, and other database objects, describe storage, perform backups and recovery, troubleshoot errors, monitor and optimize performance and automate tasks.
This course includes hands-on practice labs and a real-world inspired project to add to your portfolio that will demonstrate your ability to perform Database Administration tasks using relational databases (RDBMSes) such as MySQL, PostgreSQL and IBM Db2.
Prior knowledge of database fundamentals and SQL is required to complete this course.
-
Much of the world's data lives in databases. SQL (or Structured Query Language) is a powerful programming language that is used for communicating with and manipulating data in databases. A working knowledge of databases and SQL is necessary for anyone who wants to start a career in Data Engineering, Data Analytics or Data Science. The purpose of this course is to introduce relational database (RDBMS) concepts and help you learn and apply foundational and intermediate knowledge of the SQL language.
You will start with performing basic Create, Read, Update and Delete (CRUD) operations using CREATE, SELECT, INSERT, UPDATE and DELETE statements. You will then learn to filter, order, sort, and aggregate data. You will also work with functions, perform sub-selects and nested queries, as well as access multiple tables in the database.
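The statements named above can be seen together in a short sketch. It is illustrative only and not from the course: it runs against an in-memory SQLite database through Python's sqlite3 module (the course itself lists other RDBMSes), and the employees table and its rows are hypothetical.

```python
import sqlite3


def run_crud_demo():
    """Run CREATE, INSERT, UPDATE, DELETE and SELECT on a throwaway table."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # CREATE: define the table.
    cur.execute(
        "CREATE TABLE employees ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "dept TEXT NOT NULL, salary REAL NOT NULL)"
    )

    # INSERT: add rows, using placeholders instead of string formatting.
    cur.executemany(
        "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
        [("Ana", "Sales", 52000.0), ("Bo", "Sales", 48000.0), ("Cy", "IT", 70000.0)],
    )

    # UPDATE: change one row.
    cur.execute("UPDATE employees SET salary = salary + 2000 WHERE name = ?", ("Bo",))

    # DELETE: remove one row.
    cur.execute("DELETE FROM employees WHERE name = ?", ("Cy",))

    # SELECT with a filter (WHERE) and a sort (ORDER BY).
    rows = cur.execute(
        "SELECT name, salary FROM employees WHERE dept = ? ORDER BY salary DESC",
        ("Sales",),
    ).fetchall()

    # SELECT with an aggregate function.
    total = cur.execute("SELECT SUM(salary) FROM employees").fetchone()[0]

    conn.close()
    return rows, total


if __name__ == "__main__":
    print(run_crud_demo())
```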
The emphasis in this course is on hands-on, practical learning. As such, you will work with real database systems, use real tools, and real-world datasets. You will create a database instance in the cloud. Through a series of hands-on labs, you will practice building and running SQL queries. At the end of the course you will apply and demonstrate your skills with a final project.
The SQL skills you learn in this course will be applicable to a variety of RDBMSes such as MySQL, PostgreSQL, IBM Db2, Oracle, SQL Server and others.
No prior knowledge of databases, SQL or programming is required; however, some basic data literacy is beneficial.
Taught by
Rav Ahuja, Jeff Grossman, Ramesh Sannareddy, Lin Joyner, Rose Malcolm and Yan Luo