YoVDO

Exploratory Data Analysis with Textual Data in R / Quanteda

Offered By: Coursera Project Network via Coursera

Tags

Data Analysis Courses
Data Visualization Courses
R Programming Courses
Exploratory Data Analysis Courses
ggplot2 Courses

Course Description

Overview

In this 1-hour, project-based course, you will explore concession speeches by US presidential candidates over time, looking specifically at speech length and top words and at how they vary between Democratic and Republican candidates. You will learn how to import textual data stored in raw text files, turn these files into a corpus (a collection of textual documents), and tokenize the text, all using the R package quanteda. You will also learn how to extract useful information from filenames and use it to generate visualizations of textual data with the stringr and ggplot2 packages. Note: This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.
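As a rough illustration of the import, corpus, and tokenization steps described above, here is a minimal sketch in R. It is not the course's own code: the two toy speech files, the `speeches` folder, and the `year_candidate_party.txt` filename pattern are all assumptions made up for this example.

```r
library(quanteda)
library(stringr)

# Hypothetical toy files standing in for the course's raw text files.
# The filename pattern year_candidate_party.txt is an assumption.
speech_dir <- file.path(tempdir(), "speeches")
dir.create(speech_dir, showWarnings = FALSE)
writeLines("I congratulate my opponent and thank my supporters.",
           file.path(speech_dir, "2000_Smith_Democrat.txt"))
writeLines("Tonight the people have spoken. Thank you all.",
           file.path(speech_dir, "2004_Jones_Republican.txt"))

# Read every .txt file into a single string, named by its filename.
paths <- list.files(speech_dir, pattern = "\\.txt$", full.names = TRUE)
texts <- vapply(paths,
                function(p) paste(readLines(p), collapse = " "),
                character(1))
names(texts) <- basename(paths)

# Turn the named character vector into a quanteda corpus;
# the names become the document names.
speeches <- corpus(texts)

# Split each filename on "_" to recover year, candidate, and party,
# then attach them to the corpus as document-level variables.
parts <- str_split_fixed(str_remove(docnames(speeches), "\\.txt$"), "_", 3)
docvars(speeches, "year") <- as.integer(parts[, 1])
docvars(speeches, "candidate") <- parts[, 2]
docvars(speeches, "party") <- parts[, 3]

# Tokenize the corpus, dropping punctuation.
toks <- tokens(speeches, remove_punct = TRUE)

print(docvars(speeches))
print(ntoken(toks))
```

Storing the filename parts as document variables keeps the metadata attached to each speech, so later summaries and plots can group by year or party without re-parsing the filenames.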

Syllabus

  • Exploratory Data Analysis with Textual Data in R using Quanteda
    • By the end of this project, you will have learned how to import textual data stored in raw text files into R, turn these files into a corpus (a collection of textual documents), and tokenize the text, all using the R package quanteda. You will also learn how to extract useful information from filenames and use it to generate visualizations of textual data with the stringr and ggplot2 packages in R. Among other things, you will explore concession speeches by US presidential candidates over time, looking specifically at speech length and top words and at how they vary between Democratic and Republican candidates. This guided project is for beginners interested in quantitative text analysis in R. It assumes no knowledge of textual analysis and focuses on exploring textual data (US presidential concession speeches). Users should have a basic understanding of the statistical programming language R. By the end of the exercise, learners will know how to load textual data into R, summarize the data using descriptive quantities of interest, turn text into tokens, and visualize changes over time as well as top words. Familiarity with the stringr and ggplot2 packages is useful but not essential.
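
The summarizing and visualization steps listed in the syllabus could look roughly like the sketch below. It is an illustration under stated assumptions, not the course's material: the four short texts, their document IDs, years, and party labels are invented toy data.

```r
library(quanteda)
library(ggplot2)

# Invented toy data; the real project uses full concession speeches.
speech_df <- data.frame(
  doc_id = c("speech_1996", "speech_2000", "speech_2004", "speech_2008"),
  text = c(
    "Thank you. I have called my opponent to congratulate him.",
    "Thank you all. The people have spoken and I accept their decision.",
    "Tonight I thank my supporters and congratulate the winner.",
    "I thank every volunteer. We fought hard for this country."
  ),
  year = c(1996L, 2000L, 2004L, 2008L),
  party = c("Democrat", "Republican", "Democrat", "Republican"),
  stringsAsFactors = FALSE
)

# Columns other than doc_id and text become document variables.
speeches <- corpus(speech_df, text_field = "text")
toks <- tokens(speeches, remove_punct = TRUE)

# Speech length (number of tokens) per document, with its metadata.
lengths_df <- data.frame(
  year = docvars(speeches, "year"),
  party = docvars(speeches, "party"),
  n_tokens = as.integer(ntoken(toks))
)

# Speech length over time, one line per party.
p <- ggplot(lengths_df, aes(x = year, y = n_tokens, colour = party)) +
  geom_line() +
  geom_point() +
  labs(x = "Election year", y = "Speech length (tokens)", colour = "Party")

# Top words after removing English stopwords, overall and for one party.
dfmat <- dfm(tokens_remove(toks, pattern = stopwords("en")))
top <- topfeatures(dfmat, n = 5)
top_dem <- topfeatures(dfm_subset(dfmat, party == "Democrat"), n = 5)

print(p)
print(top)
print(top_dem)
```

Counting tokens rather than characters gives a length measure that is not affected by punctuation, and removing stopwords first keeps words such as "the" and "and" from dominating the top-word lists.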

Taught by

Nicole Baerg

Related Courses

Introducción a Data Science: Programación Estadística con R
Universidad Nacional Autónoma de México via Coursera
Programming in R for Data Science
Microsoft via edX
Data Science: Visualization
Harvard University via edX
Анализ данных в R. Часть 2
Bioinformatics Institute via Stepik
Mastering Software Development in R
Johns Hopkins University via Coursera