Cleaning Bad Data in R

Offered By: LinkedIn Learning

Tags

Data Cleaning Courses, R Programming Courses, Tidyverse Courses, Data Integrity Courses, Tidy Data Courses, Data Formatting Courses

Course Description

Overview

Clean up your data in R. Learn how to identify and address data integrity issues such as missing and duplicate data, using R and the tidyverse.

Syllabus

Introduction
  • Data is messy
  • What you need to know
1. Missing Data
  • Types of missing data
  • Missing values
  • Missing rows
  • Aggregations and missing values
2. Duplicated Data
  • Duplicated rows and values
  • Aggregations in the data set
3. Formatting Data
  • Converting dates
  • Unit conversions
  • Numbers stored as text
  • Text improperly converted to numbers
  • Inconsistent spellings
4. Outliers
  • Screening for outliers
  • Handling outliers
  • Outliers use case
  • Outliers in subgroups
  • Detecting illogical values
5. Tidy Data
  • What is tidy data?
  • Variables, observations, and values
  • Common data problems
  • Wide vs. long data sets
  • Making wide data sets long
  • Making long data sets wide
6. Red Flags
  • Suspicious values
  • Suspicious multiples
Conclusion
  • What's next?

Taught by

Mike Chapple

Related Courses

Statistics One
Princeton University via Coursera
Introduction to Computational Finance and Financial Econometrics
University of Washington via Coursera
Curso Práctico de Bioestadística con R
Universidad San Pablo CEU via Miríadax
Análisis Estadístico de datos con R
Universidad Católica de Murcia via Miríadax
Data Analysis with R
Facebook via Udacity