Introduction to Text Classification in R with quanteda
Offered By: Coursera Project Network via Coursera
Course Description
Overview
In this guided project, you will learn how to import textual data stored in raw text files into R, turn these files into a corpus (a collection of textual documents), reshape the corpus from documents into paragraphs, and tokenize the text, all using the R package quanteda. You will then learn how to classify the texts using the Naive Bayes algorithm.
This guided project is for beginners interested in quantitative text analysis in R. It assumes no knowledge of textual analysis and focuses on exploring textual data (US Presidential Concession Speeches). Users should have a basic understanding of the statistical programming language R.
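The workflow described above (build a corpus, reshape it to paragraphs, tokenize, classify with Naive Bayes) might look roughly like the sketch below. This is not the course's own code: the two toy texts and the party labels "A" and "B" are invented for illustration, and the sketch assumes the quanteda and quanteda.textmodels packages are installed. In the course, the texts come from raw text files rather than inline strings.

```r
library(quanteda)
library(quanteda.textmodels)  # provides textmodel_nb()

# Toy documents standing in for speeches read from raw text files.
# The texts and the party labels "A"/"B" are made up for this sketch.
txt <- c(
  d1 = "We will cut taxes.\n\nWe will cut spending.",
  d2 = "We will expand healthcare.\n\nWe will fund schools."
)
corp <- corpus(txt, docvars = data.frame(party = c("A", "B")))

# Reshape from documents to paragraphs (split on blank lines);
# each paragraph keeps the party label of its parent document.
paras <- corpus_reshape(corp, to = "paragraphs")

# Tokenize and build a document-feature matrix (machine-readable counts).
toks <- tokens(paras, remove_punct = TRUE)
dfmat <- dfm(toks)

# Fit a Naive Bayes classifier with party as the class label.
nb <- textmodel_nb(dfmat, y = docvars(dfmat, "party"))

# Predict on the training paragraphs and compare with the true labels.
pred <- predict(nb, newdata = dfmat)
confusion <- table(predicted = pred, actual = docvars(dfmat, "party"))
accuracy <- mean(pred == docvars(dfmat, "party"))
```

Here the model is evaluated on the same paragraphs it was trained on, which only keeps the sketch short. A realistic accuracy assessment would hold out a separate test set.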
Syllabus
- Project Overview
- By the end of this project, you will have imported documents, reshaped texts from documents to paragraphs, turned your texts into a machine-readable format, and classified US presidential concession speeches by political party using the Naive Bayes algorithm. You will also learn to assess the accuracy of the predictions and to summarize the data using descriptive quantities of interest.
Taught by
Nicole Baerg
Related Courses
- Statistics One (Princeton University via Coursera)
- Introduction to Computational Finance and Financial Econometrics (University of Washington via Coursera)
- Curso Práctico de Bioestadística con R (Universidad San Pablo CEU via Miríadax)
- Análisis Estadístico de datos con R (Universidad Católica de Murcia via Miríadax)
- Data Analysis with R (Facebook via Udacity)