Big Data Analytics
Offered By: Queensland University of Technology via FutureLearn
Course Description
Overview
Big data is a fast-growing field and skills in the area are some of the most in demand today.
The Big Data Analytics program from Queensland University of Technology (QUT) comprises four online courses that each look at a different element of big data.
You’ll begin by examining how big data is collected and stored, before going on to explore how statistical inference, machine learning, mathematical modelling and data visualisation are used in its analysis.
You’ll become familiar with predictive analysis, dimension reduction, machine learning, clustering techniques and decision trees, before going on to look at the maths that underpins many of the tools you can use to manage and analyse big data.
Accessible for free on desktop, tablet or mobile and delivered in bite-sized chunks, the courses provide a flexible way to develop your big data analytics skills.
Once you have completed all four courses, upgraded, and earned a Certificate of Achievement for each, you will receive a FutureLearn Award as proof of completing the program of study.
Syllabus
Courses under this program:
Course 1: Big Data: from Data to Decisions
-Get a practical insight into big data analytics, and popular tools and frameworks for collecting, storing and managing data.
Course 2: Big Data: Statistical Inference and Machine Learning
-Learn how to apply selected statistical and machine learning techniques and tools to analyse big data.
Course 3: Big Data: Mathematical Modelling
-Learn how to apply selected mathematical modelling methods to analyse big data in this free online course.
Course 4: Big Data: Data Visualisation
-Data visualisation is vital in bridging the gap between data and decisions. Discover the methods, tools and processes involved.
Courses
Course 1: Big Data: from Data to Decisions
Many datasets can provide solutions to important problems and inform decisions. However, the size, complexity, quality and diversity of these datasets make them difficult to process and analyse. Join us and we’ll share new technological or methodological solutions you can use to meet the demand for analytics in your field.
We will introduce big data and some of the statistical and mathematical approaches for analysing it. Then we explore the power of big data and the process of getting from data to decisions before you use some of the tools available for storing and managing large datasets.
Our course is open to anyone with an interest in big data and is essential if you’re looking to add big data analytics to your skill set. A basic knowledge of software engineering, statistics and mathematics will help you gain the most from this learning experience.
The course includes optional, practical exercises designed to help you become familiar with some of the current tools used in big data analytics. If you would like to try the exercises you will need to install Cloudera, an open source platform for data management and analytics. We will email you detailed instructions explaining how to access a trial version and set up a virtual machine before the course starts.
Course 2: Big Data: Statistical Inference and Machine Learning
Many people have big data but only some know what to do with it. Why? The big problem is that the data is big: the size, complexity and diversity of datasets increase every day. This means we need new solutions for analysing data.
This course equips you for working with these solutions by introducing you to selected statistical and machine learning techniques used for analysing large datasets and extracting information. We also expose you to three software packages so you can develop your coding skills by completing practical exercises.
You will get the most from this course if you have a basic understanding of statistics and mathematics at university undergraduate level.
You will be using the following free tools. Please review the product websites below to ensure your system meets the minimum requirements:
R and RStudio Desktop (open source edition)
You will complete practical exercises using RStudio, so you’ll need to be familiar enough with R to:
- install a package
- import data
- read and run starter code
- develop a solution or read through a solution and gain understanding from it.
NOTE: You must have a working installation of R before you can use RStudio.
H2O Flow
H2O Flow can be used as a stand-alone package for big data analytics or can be used in conjunction with R. This package will allow you to tackle larger problems that you might encounter in your own work.
WEKA
WEKA is a popular workbench for machine learning and statistical analysis. It comprises a very wide range of tools that are suitable for big data analysis.
Knowing R, H2O Flow and WEKA will give you a powerful, flexible and scalable set of tools to manipulate and analyse big data.
Course 3: Big Data: Mathematical Modelling
Learn how mathematics underpins big data analysis and develop your skills.
Mathematics is everywhere, and with the rise of big data it has become a useful tool for extracting information from large datasets and analysing them. We begin by explaining how maths underpins many of the tools that are used to manage and analyse big data. We show how very different applied problems can have common mathematical aims, and can therefore be addressed using similar mathematical tools. We then introduce three such tools, based on a linear algebra framework: eigenvalues and eigenvectors for ranking; the graph Laplacian for clustering; and singular value decomposition for data compression.
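As a rough illustration of the first of these tools, the sketch below ranks items by approximating the dominant eigenvector of a link matrix with power iteration. It is not taken from the course materials, which use MATLAB. The function name `power_iteration` and the three-page link matrix are hypothetical and chosen only for this example.

```python
def power_iteration(matrix, iterations=200, tol=1e-12):
    """Approximate the dominant eigenvector of a square matrix (list of rows)."""
    n = len(matrix)
    vec = [1.0 / n] * n  # start from a uniform vector
    for _ in range(iterations):
        # Multiply the matrix by the current vector.
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        # Rescale so the absolute values of the entries sum to 1.
        total = sum(abs(x) for x in nxt)
        nxt = [x / total for x in nxt]
        # Stop once the vector no longer changes noticeably.
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec


# Hypothetical link graph with pages A, B and C (an assumption for this sketch):
# A links to B and C, B links to C, C links to A.
# Column j holds the share of page j's links that point to each page.
links = [
    [0.0, 0.0, 1.0],  # links into A
    [0.5, 0.0, 0.0],  # links into B
    [0.5, 1.0, 0.0],  # links into C
]

scores = power_iteration(links)
for page, score in zip("ABC", scores):
    print(page, round(score, 3))
```

For this small graph the scores settle near 0.4 for A, 0.2 for B and 0.4 for C, so B ranks lowest. Power iteration is used here only because it needs nothing beyond repeated matrix-vector products; a linear algebra library routine would normally be preferred in practice.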
This course is designed for anyone looking to add mathematical methods for data analytics to their skill set. We provide a multi-layered approach, so you can learn about the methods even if you don’t have a strong maths background, but we provide further information for those with a sound knowledge of undergraduate mathematics. We will assume basic MATLAB (or other) programming skills for some of the practical exercises.
MathWorks will provide you with free access to MATLAB Online for the duration of the course so you can complete the programming exercises. Please visit MATLAB Online to ensure your system meets the minimum requirements.
Course 4: Big Data: Data Visualisation
Data visualisation is an important method for communicating effectively and for analysing large datasets. Through data visualisations we can draw conclusions from data that are not always immediately obvious, and interact with the data in an entirely different way.
This course will provide you with an informative introduction to the methods, tools and processes involved in visualising big data. We will also briefly examine the use of visualisation throughout history, dating back as far as 17,000 BC.
We have designed the course for people from different fields who want to learn how to produce visualisations that help us better understand real-world big data problems. You will gain the most from the practical exercises if you are comfortable with computer programming; however, you don’t need any prior experience with the software listed below.
We will use a variety of tools so that you become comfortable engaging with different software and confident trialling new packages to find those that best meet your needs. Please review the product websites below to ensure your system meets the minimum requirements for the tools we will be using.
- Tableau: You can use the free trial for two weeks. Please do not start the trial until you are ready to do the Tableau exercises.
- MATLAB Online: MathWorks will provide you with a license to use MATLAB Online for the duration of the course.
- D3.js: The D3 JavaScript library is available under a BSD license.
You can still learn effectively even if you don’t have access to all of these tools as you will be able to see what they can do for you.
Taught by
Tomasz Bednarz, Kerrie Mengersen and Ian Turner
Related Courses
- Data Modeling and Regression Analysis in Business (University of Illinois at Urbana-Champaign via Coursera)
- Data Science: Inference and Modeling (Harvard University via edX)
- Bayesian Data Analysis in Python (DataCamp)
- Foundations of Inference in R (DataCamp)
- Foundations of Probability in R (DataCamp)