Unveiling Clustering in BERTopic Topic Modeling
Offered By: Conf42 via YouTube
Course Description
Overview
Explore the intricacies of clustering in BERTopic topic modeling through this 27-minute conference talk from Conf42 ML 2023. Delve into topic modeling use cases, understand why BERTopic is a preferred choice, and examine its end-to-end flow. Learn about the HDBSCAN clustering algorithm, its foundations in DBSCAN, and how it uses k-NN and a minimum spanning tree to perform density-based spatial clustering. Discover the stability score "λ" and its role in determining the final clusters. Analyze HDBSCAN's performance, strengths, and weaknesses through a practical demo and a comprehensive explanation. Gain insights into future scope and access references for further exploration of this topic modeling technique.
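To make the k-NN and minimum spanning tree steps mentioned above more concrete, here is a minimal pure-Python sketch. It is not the code from the talk's demo and not the implementation used by the hdbscan library; the function names, the toy points, and the choice of k are assumptions made purely for illustration. It computes each point's core distance (the distance to its k-th nearest neighbor), derives the mutual reachability distance between points, and builds a minimum spanning tree over those distances with Prim's algorithm. The later stages of HDBSCAN (building the cluster hierarchy and selecting clusters by stability) are left out.

```python
import math

def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbor (itself excluded)."""
    cores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        cores.append(dists[k - 1])
    return cores

def mutual_reachability(points, k):
    """Pairwise matrix of max(core(a), core(b), dist(a, b))."""
    core = core_distances(points, k)
    n = len(points)
    return [
        [
            0.0 if i == j else max(core[i], core[j], math.dist(points[i], points[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]

def minimum_spanning_tree(weights):
    """Prim's algorithm on a dense weight matrix; returns edges as (i, j, weight)."""
    n = len(weights)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Pick the cheapest edge that leaves the current tree.
        w, i, j = min(
            (weights[i][j], i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        in_tree.add(j)
        edges.append((i, j, w))
    return edges

# Toy data (an assumption): two tight groups of three points, far apart.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

mst = minimum_spanning_tree(mutual_reachability(points, k=2))
longest = max(mst, key=lambda edge: edge[2])
print(mst)
print("longest edge:", longest)
```

In this toy example the single longest edge of the tree is the one that bridges the two groups, so cutting it would separate them. That is the intuition behind reading a density hierarchy off the minimum spanning tree.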
Syllabus
intro
preface
who are we?
agenda
topic modeling use case
why bertopic?
bertopic end-to-end flow
clustering
dataset description
demo
what is hdbscan?
to understand hdbscan we need to know dbscan
what if there was no fixed radius?
k-nn algorithm to define radius
minimum spanning tree finds density and hierarchy
density based spatial clustering
stability score "λ"
final clusters
hdbscan steps
hdbscan - performance comparison
hdbscan - strengths and weaknesses
conclusion and future scope
references & resources
thank you
Taught by
Conf42
Related Courses
Introduction to Machine Learning
ITMO University via edX
Advanced Data Science Techniques in SPSS
Udemy
Supervised Learning in R: Classification
DataCamp
Machine Learning for Telecom Customers Churn Prediction
Coursera Project Network via Coursera
Data Science: Supervised Machine Learning in Python
Udemy