Building a Naive Bayes Text Classifier with scikit-learn

Offered By: EuroPython Conference via YouTube

Tags

EuroPython Courses
Machine Learning Courses
scikit-learn Courses
Feature Extraction Courses
Text Classification Courses
Bag of Words Courses
Naive Bayes Classifier Courses
Spam Detection Courses
TF-IDF Courses

Course Description

Overview

Explore a 30-minute EuroPython Conference talk on building a Naive Bayes text classifier with scikit-learn. Learn why the algorithm, despite its simplicity, is effective at classifying large, sparse datasets such as text documents. Discover preprocessing techniques such as text normalization and feature extraction. Follow along as the speaker builds a model on the YouTube Spam Collection, a spam/ham comment dataset from the UCI repository. Gain insights into the history, advantages, and disadvantages of Naive Bayes. Work through a practical example, the underlying equation, and the implementation steps: loading the dataset, splitting it into train and test sets, and extracting features with the bag-of-words and TF-IDF approaches. Conclude with model evaluation and tuning of the Laplace smoothing parameter.
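
The workflow described above (train/test split, bag-of-words or TF-IDF feature extraction, Naive Bayes training, evaluation) can be sketched roughly as follows. This is a minimal illustration, not the speaker's code: the eight comments and their labels are hypothetical stand-ins for the YouTube Spam Collection, and the helper name `fit_and_score` is an assumption made for this example.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the real dataset (1 = spam, 0 = ham).
comments = [
    "Check out my channel and subscribe now",
    "Free gift cards, click this link",
    "Subscribe to my channel for free followers",
    "Click here to win money fast",
    "I love this song so much",
    "This video brings back great memories",
    "Her voice is amazing in this track",
    "Best music video of the year",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out a quarter of the comments for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    comments, labels, test_size=0.25, random_state=0, stratify=labels
)


def fit_and_score(vectorizer, alpha=1.0):
    """Fit vectorizer + multinomial Naive Bayes; return the model and test accuracy."""
    model = make_pipeline(vectorizer, MultinomialNB(alpha=alpha))
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


# Bag-of-words features: raw token counts, lowercased as a simple normalization.
bow_model, bow_acc = fit_and_score(CountVectorizer(lowercase=True))

# TF-IDF features: counts reweighted so that very common words matter less.
tfidf_model, tfidf_acc = fit_and_score(TfidfVectorizer(lowercase=True))

print("bag-of-words accuracy:", bow_acc)
print("TF-IDF accuracy:", tfidf_acc)
```

With so few samples the accuracy figures carry no meaning; the sketch only shows how the pieces fit together. Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned from the training split only, which avoids leaking test data into feature extraction.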

Syllabus

Introduction
Naive Bayes: A Little History
Naive Bayes: Advantages and Disadvantages
About the dataset: YouTube Spam Collection
Pre-requisites
Naive Bayes: An example
Naive Bayes: The Equation
Loading the Dataset
Train/test split
Feature Extraction: Bag of Words Approach
Bag of Words Approach: Training
Bag of Words Approach: Testing and Evaluation
Feature Extraction: TF-IDF Approach
TF-IDF Approach: Training
TF-IDF Approach: Testing and Evaluation
Tuning parameters: Laplace smoothing
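
As a rough illustration of what the Laplace smoothing step in the syllabus adjusts, the sketch below estimates a per-class word likelihood by hand using additive smoothing. It is a hand-rolled example under stated assumptions, not code from the talk: the token lists and the function name `word_likelihood` are hypothetical. In scikit-learn, the corresponding setting is the `alpha` parameter of `MultinomialNB`.

```python
from collections import Counter

# Hypothetical training tokens per class (illustrative only).
spam_tokens = "check out my channel check out my video".split()
ham_tokens = "love this song great video".split()

# The vocabulary is every distinct word seen in either class.
vocabulary = set(spam_tokens) | set(ham_tokens)


def word_likelihood(word, class_tokens, alpha):
    """Estimate P(word | class) with additive (Laplace) smoothing.

    alpha = 0 gives the raw relative frequency; alpha = 1 is classic
    Laplace smoothing, which adds one pseudo-count per vocabulary word.
    """
    counts = Counter(class_tokens)
    return (counts[word] + alpha) / (len(class_tokens) + alpha * len(vocabulary))


# "song" never appears in the spam tokens.
unsmoothed = word_likelihood("song", spam_tokens, alpha=0.0)
smoothed = word_likelihood("song", spam_tokens, alpha=1.0)

print("without smoothing:", unsmoothed)
print("with alpha = 1:   ", smoothed)
```

Without smoothing, a word unseen in a class gets likelihood zero, and because Naive Bayes multiplies the per-word likelihoods, that single zero wipes out the class score for the whole comment. A positive `alpha` keeps every likelihood above zero while the smoothed estimates still sum to one over the vocabulary.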


Taught by

EuroPython Conference

Related Courses

Introduction to AI for Cybersecurity
Johns Hopkins University via Coursera
No-Code Machine Learning Using Amazon AWS SageMaker Canvas
Packt via Coursera
Deploy Machine Learning Model into AWS Cloud Servers
Coursera Project Network via Coursera
Let's Build A Forum with Laravel and TDD
Laracasts
Google Analytics: Spam Proofing
LinkedIn Learning