Cleaning Bad Data in R
Offered By: LinkedIn Learning
Course Description
Overview
Clean up your data in R. Learn how to identify and address data integrity issues such as missing and duplicate data, using R and the tidyverse.
Syllabus
Introduction
- Data is messy
- What you need to know
- Types of missing data
- Missing values
- Missing rows
- Aggregations and missing values
- Duplicated rows and values
- Aggregations in the data set
- Converting dates
- Unit conversions
- Numbers stored as text
- Text improperly converted to numbers
- Inconsistent spellings
- Screening for outliers
- Handling outliers
- Outliers use case
- Outliers in subgroups
- Detecting illogical values
- What is tidy data?
- Variables, observations, and values
- Common data problems
- Wide vs. long data sets
- Making wide data sets long
- Making long data sets wide
- Suspicious values
- Suspicious multiples
- What's next?
Taught by
Mike Chapple
Related Courses
- Data Wrangling with MongoDB (MongoDB via Udacity)
- Getting and Cleaning Data (Johns Hopkins University via Coursera)
- 软件包在流行病学研究中的应用 Using software apps in epidemiological research (Peking University via Coursera)
- Creating an Analytical Dataset (Udacity)
- Implementing ETL with SQL Server Integration Services (Microsoft via edX)