YoVDO

Topic Modeling Workshop for Beginners in Python

Offered By: Prodramp via YouTube

Tags

- Topic Modeling Courses
- Data Analysis Courses
- Python Courses
- Data Cleaning Courses
- Natural Language Toolkit (NLTK) Courses
- spaCy Courses

Course Description

Overview

Work through a 42-minute workshop on topic modeling in Python that combines Gensim, spaCy, NLTK, and other libraries. Learn to process NIPS papers in six steps: data loading, data preparation, exploratory data analysis (EDA), data modeling, LDA topic modeling, and performance analysis. Techniques covered include punctuation removal, word cloud creation, stop word removal, bigram and trigram generation, lemmatization, and tokenization. Visualize the resulting topics and calculate coherence scores to assess model performance. The accompanying GitHub notebook supports hands-on practice, and timestamps mark each section of the tutorial.
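As a rough illustration of the data preparation and stop word steps named above, the following is a minimal sketch using only the Python standard library. It is not the workshop's notebook code: the function names `clean_text` and `tokenize` and the tiny `STOP_WORDS` set are hypothetical stand-ins for what the workshop does with NLTK, spaCy, and Gensim.

```python
import re
import string

# Hypothetical miniature stop-word list, for illustration only.
# The workshop relies on library-provided lists instead.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "with"}


def clean_text(text: str) -> str:
    """Apply the three preparation steps: punctuation, digits, lowercase."""
    # Replace every ASCII punctuation character with a space.
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)
    # Drop standalone digits and any word that contains a digit.
    text = re.sub(r"\w*\d\w*", " ", text)
    # Lowercase everything.
    text = text.lower()
    # Collapse the runs of whitespace left behind by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split cleaned text on whitespace and drop stop words."""
    return [tok for tok in clean_text(text).split() if tok not in STOP_WORDS]


sample = "Deep Learning, in 2017: the GPT2 model!"
cleaned = clean_text(sample)
tokens = tokenize(sample)
```

Replacing punctuation with a space rather than deleting it keeps words on either side of a comma or period from being fused together. Hyphenated terms are split into separate words as a side effect.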

Syllabus

- Tutorial Starts
- Topic Modeling Intro
- Workshop Environment
- Content location at GitHub
- Dataset used in this workshop
- LDA Intro
- Topic Modeling Use Cases
- 6 Steps in this Workshop
- Step 1: Loading Data
- Step 2: Data Preparation
- Step 2.1: Removing Punctuation
- Step 2.2: Removing digits and words containing digits
- Step 2.3: Lowercasing all content
- Step 3: EDA
- Step 3.1: Word Cloud
- Step 3.2: Document Term Matrix
- Step 4: Data Modeling
- Step 4.1: Stop words removal
- Step 4.2: Creating Bigram and Trigram
- Step 4.3: Lemmatization
- Step 4.4: Tokenization
- Step 5: LDA Topic Modeling
- Step 6: Topic Modeling Performance and analysis
- Step 6.1: Topic visualization
- Step 6.2: Coherence Score
- Saving notebook to GitHub
- Recap
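
Two of the syllabus items, the document term matrix (Step 3.2) and bigram creation (Step 4.2), can be sketched in plain Python as shown below. This is an assumption-laden simplification rather than the workshop's method: the helper names `bigrams` and `document_term_matrix` are hypothetical, and this `bigrams` joins every adjacent token pair, whereas a phrase-detection model such as Gensim's keeps only pairs that co-occur frequently.

```python
from collections import Counter


def bigrams(tokens: list[str]) -> list[str]:
    """Join each adjacent pair of tokens with an underscore."""
    return ["_".join(pair) for pair in zip(tokens, tokens[1:])]


def document_term_matrix(docs: list[list[str]]) -> tuple[list[str], list[list[int]]]:
    """Return a sorted vocabulary and one row of term counts per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        # Counter returns 0 for terms that do not occur in this document.
        rows.append([counts[term] for term in vocab])
    return vocab, rows


# Toy tokenized documents, invented for this example.
docs = [["topic", "model", "topic"], ["neural", "model"]]
vocab, matrix = document_term_matrix(docs)
pairs = bigrams(docs[0])
```

Each row of `matrix` lines up with `vocab`, so a column sum gives a term's total frequency across the corpus, which is the kind of count a word cloud is drawn from.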


Taught by

Prodramp

Related Courses

- Social Network Analysis (University of Michigan via Coursera)
- Intro to Algorithms (Udacity)
- Data Analysis (Johns Hopkins University via Coursera)
- Computing for Data Analysis (Johns Hopkins University via Coursera)
- Health in Numbers: Quantitative Methods in Clinical & Public Health Research (Harvard University via edX)