YoVDO

Analyze Text Data with Yellowbrick

Offered By: Coursera Project Network via Coursera

Tags

Natural Language Processing (NLP) Courses Data Visualization Courses Machine Learning Courses Text Mining Courses

Course Description

Overview

Welcome to this project-based course on Analyzing Text Data with Yellowbrick. Tasks such as assessing document similarity, topic modelling and other text mining endeavors are predicated on the notion of "closeness" or "similarity" between documents. In this course, we define various distance metrics (e.g. Euclidean, Hamming, Cosine, Manhattan, etc) and understand their merits and shortcomings as they relate to document similarity. We will apply these metrics on documents within a specific corpus and visualize our results. By the end of this course, you will be able to confidently use visual diagnostic tools from Yellowbrick to steer your machine learning workflow, vectorize text data using TF-IDF, and cluster documents using embedding techniques and appropriate metrics. This course runs on Coursera's hands-on project platform called Rhyme. On Rhyme, you do projects in a hands-on manner in your browser. You will get instant access to pre-configured cloud desktops containing all of the software and data you need for the project. Everything is already set up directly in your internet browser so you can just focus on learning. For this project, you’ll get instant access to a cloud desktop with Python, Jupyter, Yellowbrick, and scikit-learn pre-installed. Notes: - You will be able to access the cloud desktop 5 times. However, you will be able to access instructions videos as many times as you want. - This course works best for learners who are based in the North America region. We’re currently working on providing the same experience in other regions.

Syllabus

  • Project: Analyze Text Data with Yellowbrick
    • Welcome to this project-based course on Analyzing Text Data with Yellowbrick. Tasks such as assessing document similarity, topic modelling and other text mining endeavors are predicated on the notion of "closeness" or "similarity" between documents. In this course, we define various distance metrics (e.g. Euclidean, Hamming, Cosine, Manhattan, etc) and understand their merits and shortcomings as they relate to document similarity. We will apply these metrics on documents within a specific corpus and visualize our results.

Taught by

Snehan Kekre

Related Courses

Design Computing: 3D Modeling in Rhinoceros with Python/Rhinoscript
University of Michigan via Coursera
3D SARS-CoV-19 Protein Visualization With Biopython
Coursera Project Network via Coursera
A Simple Scatter Plot using D3 js
Coursera Project Network via Coursera
Access Bioinformatics Databases with Biopython
Coursera Project Network via Coursera
Accounting Data Analytics
University of Illinois at Urbana-Champaign via Coursera