NLP with Python for Machine Learning Essential Training

Offered By: LinkedIn Learning

Tags

Natural Language Processing (NLP) Courses, Machine Learning Courses, Python Courses, Regular Expressions Courses, Data Cleaning Courses

Course Description

Overview

Explore natural language processing (NLP) concepts, review advanced data cleaning and vectorization techniques, and learn how to build machine learning classifiers.

Syllabus

Introduction
  • Welcome
  • What you should know
  • What tools do you need?
  • Using the exercise files
1. NLP Basics
  • What are NLP and NLTK?
  • NLTK setup and overview
  • Reading in text data
  • Exploring the dataset
  • What are regular expressions?
  • Learning how to use regular expressions
  • Regular expression replacements
  • Machine learning pipeline
  • Implementation: Removing punctuation
  • Implementation: Tokenization
  • Implementation: Removing stop words
2. Supplemental Data Cleaning
  • Introducing stemming
  • Using stemming
  • Introducing lemmatizing
  • Using lemmatizing
3. Vectorizing Raw Data
  • Introducing vectorizing
  • Count vectorization
  • N-gram vectorizing
  • Inverse document frequency weighting
4. Feature Engineering
  • Introducing feature engineering
  • Feature creation
  • Feature evaluation
  • Identifying features for transformation
  • Box-Cox power transformation
5. Building Machine Learning Classifiers
  • What is machine learning?
  • Cross-validation and evaluation metrics
  • Introducing random forest
  • Building a random forest model
  • Random forest with holdout test set
  • Random forest model with grid search
  • Evaluate random forest model performance
  • Introducing gradient boosting
  • Gradient-boosting grid search
  • Evaluate gradient-boosting model performance
  • Model selection: Data prep
  • Model selection: Results
Conclusion
  • Next steps
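
As a rough illustration of the cleaning and vectorizing steps named in sections 1 and 3, here is a minimal sketch in plain Python. It is not the course's own code. The course works with NLTK, while this sketch uses only the standard library so it runs without extra installs. The names STOP_WORDS, clean_text, count_vectorize, and tfidf are hypothetical, the stop-word list is a tiny stand-in for a real one, and the weighting shown is one common TF-IDF variant (term frequency times log of inverse document frequency), which may differ from what the course uses.

```python
import math
import re
import string
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this
# sketch); a real pipeline would use a fuller list such as NLTK's.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "you", "this"}


def clean_text(text):
    """Remove punctuation, tokenize, and drop stop words."""
    # Step 1: remove punctuation characters.
    no_punct = "".join(ch for ch in text if ch not in string.punctuation)
    # Step 2: tokenize by splitting on runs of non-word characters.
    tokens = re.split(r"\W+", no_punct.lower())
    # Step 3: drop empty strings and stop words.
    return [tok for tok in tokens if tok and tok not in STOP_WORDS]


def count_vectorize(docs):
    """Return (vocabulary, rows); each row holds one document's term counts."""
    tokenized = [clean_text(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf(rows):
    """Weight counts by term frequency times log(N / document frequency)."""
    n_docs = len(rows)
    n_terms = len(rows[0]) if rows else 0
    # Number of documents in which each term appears at least once.
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in rows:
        total = sum(row) or 1  # avoid dividing by zero for empty documents
        weighted.append(
            [(row[j] / total) * math.log(n_docs / doc_freq[j])
             for j in range(n_terms)]
        )
    return weighted
```

Calling count_vectorize on a list of raw messages yields a sorted vocabulary plus one count row per message, and tfidf(rows) down-weights terms that appear in every message. Rows like these are the kind of numeric input a classifier from section 5 would be trained on.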

Taught by

Derek Jedamski

Related Courses

Data Wrangling with MongoDB
MongoDB via Udacity
Getting and Cleaning Data
Johns Hopkins University via Coursera
Using software apps in epidemiological research
Peking University via Coursera
Creating an Analytical Dataset
Udacity
Implementing ETL with SQL Server Integration Services
Microsoft via edX