Topic Modeling in R

Offered By: DataCamp

Tags

R Programming Courses, Data Visualization Courses, Machine Learning Courses, Text Analysis Courses, ggplot2 Courses, Topic Modeling Courses, Latent Dirichlet Allocation Courses

Course Description

Overview

Learn how to fit topic models using the Latent Dirichlet Allocation algorithm.

This course introduces students to the main steps of topic modeling: preparing a corpus, fitting topic models with the Latent Dirichlet Allocation (LDA) algorithm from the topicmodels package, and visualizing the results with ggplot2 and word clouds.
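
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not material from the course: the toy corpus, the object names (`docs`, `dtm`, `mod`, `top_terms`), the choice of two topics, and the control values are all assumptions, and the tidytext and dplyr packages are used here only as one convenient way to build the document-term matrix.

```r
library(dplyr)
library(tidytext)
library(topicmodels)
library(ggplot2)

# Toy corpus, made up for illustration: one row per document.
docs <- tibble(
  id = 1:4,
  text = c(
    "the cat chased the mouse around the house",
    "dogs and cats are popular pets",
    "the stock market fell as investors sold shares",
    "investors buy shares when the market rises"
  )
)

# 1. Prepare the corpus: tokenize, drop stop words, count words per
#    document, and cast the counts into a document-term matrix.
dtm <- docs %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(id, word) %>%
  cast_dtm(document = id, term = word, value = n)

# 2. Fit an LDA model with two topics using Gibbs sampling.
#    The seed, alpha, and iter values are arbitrary choices.
mod <- LDA(dtm, k = 2, method = "Gibbs",
           control = list(seed = 1234, alpha = 1, iter = 500))

# 3. Extract per-topic word probabilities (beta) and keep the
#    five most probable terms in each topic.
top_terms <- tidy(mod, matrix = "beta") %>%
  group_by(topic) %>%
  slice_max(beta, n = 5, with_ties = FALSE) %>%
  ungroup()

# 4. Visualize the top terms per topic as horizontal bar charts.
ggplot(top_terms, aes(x = reorder(term, beta), y = beta)) +
  geom_col() +
  coord_flip() +
  facet_wrap(~ topic, scales = "free_y") +
  labs(x = "term", y = "beta")
```

With a corpus this small the topics are not meaningful; the point is only the order of the steps.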

Syllabus

Quick introduction to the workflow
-This chapter introduces the workflow used in topic modeling: preparation of a document-term matrix, model fitting, and visualization of results with ggplot2.

Wordclouds, stopwords, and control arguments
-This chapter explains how to use join functions to remove or keep words in the document-term matrix, how to make wordcloud charts, and how to use some of the many control arguments.

Named entity recognition as unsupervised classification
-This chapter goes into detail on how LDA topic models can be used as classifiers. It covers the importance of the Dirichlet shape parameter alpha, construction of word contexts for named entities using regex, and technical issues like corpus alignment and held-out data.

How many topics is enough?
-This chapter explains the basic methods used in the search for the optimal number of topics. It also covers how to use a single document as a source of data, and how topic numbering can be controlled using seed words.
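
One common way to search for a number of topics is to compare perplexity on held-out documents across several candidate values. The sketch below assumes that approach, which the listing itself does not name. It uses the `AssociatedPress` data set shipped with topicmodels, and the split sizes, the candidate values, and the names `train`, `test`, `candidate_k`, and `perp` are illustrative choices.

```r
library(topicmodels)

# Document-term matrix bundled with the topicmodels package.
data("AssociatedPress", package = "topicmodels")

# Small train / held-out split to keep the run short (sizes are arbitrary).
train <- AssociatedPress[1:200, ]
test  <- AssociatedPress[201:250, ]

# Candidate numbers of topics to compare (arbitrary choices).
candidate_k <- c(2, 4, 8)

# Fit one model per candidate and score it on the held-out documents.
# Lower perplexity means the model predicts the held-out text better.
perp <- sapply(candidate_k, function(k) {
  mod <- LDA(train, k = k, control = list(seed = 1234))
  perplexity(mod, newdata = test)
})

data.frame(k = candidate_k, perplexity = perp)
```

In practice the candidate range would be wider, and the curve is usually inspected for the point where adding topics stops paying off rather than for a strict minimum.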


Taught by

Pavel Oleinikov

Related Courses

Introduction to Natural Language Processing in R
DataCamp
Introduction to Text Analysis in R
DataCamp
Introduction to Text Mining with R
Higher School of Economics via Coursera
Introduction to Topic Modelling in R
Coursera Project Network via Coursera
Natural Language Processing and Capstone Assignment
University of California, Irvine via Coursera