Cleaning Bad Data in R

Offered By: LinkedIn Learning

Tags

Data Cleaning Courses, R Programming Courses, Tidyverse Courses, Data Integrity Courses, Tidy Data Courses, Data Formatting Courses

Course Description

Overview

Clean up your data in R. Learn how to identify and address data integrity issues such as missing and duplicate data, using R and the tidyverse.

Syllabus

Introduction
  • Data is messy
  • What you need to know
1. Missing Data
  • Types of missing data
  • Missing values
  • Missing rows
  • Aggregations and missing values
2. Duplicated Data
  • Duplicated rows and values
  • Aggregations in the data set
3. Formatting Data
  • Converting dates
  • Unit conversions
  • Numbers stored as text
  • Text improperly converted to numbers
  • Inconsistent spellings
4. Outliers
  • Screening for outliers
  • Handling outliers
  • Outliers use case
  • Outliers in subgroups
  • Detecting illogical values
5. Tidy Data
  • What is tidy data?
  • Variables, observations, and values
  • Common data problems
  • Wide vs. long data sets
  • Making wide data sets long
  • Making long data sets wide
6. Red Flags
  • Suspicious values
  • Suspicious multiples
Conclusion
  • What's next?

Taught by

Mike Chapple

Related Courses

Data Wrangling with MongoDB
MongoDB via Udacity
Getting and Cleaning Data
Johns Hopkins University via Coursera
Using software apps in epidemiological research
Peking University via Coursera
Creating an Analytical Dataset
Udacity
Implementing ETL with SQL Server Integration Services
Microsoft via edX