Solving Real World Data Science Tasks With Python Beautiful Soup - Movie Dataset Creation
Offered By: Keith Galli via YouTube
Course Description
Overview
Syllabus
- Video overview
- Check out DataCamp! (sponsored)
- Setup
Task #1: Scrape the infobox from the Toy Story 3 Wikipedia page and save it in a Python dictionary
Task #2: Scrape the infobox for all movies in the List of Disney Films and save them as a list of dictionaries
- Robots.txt: are you allowed to scrape a site?
- Task #2 (continued): scrape the infobox for all movies in the List of Disney Films
- Save & Load dataset checkpoint JSON file
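For readers wondering what the robots.txt check in Task #2 can look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the `example.org` URLs are invented for illustration and are not taken from the video; a real check would read the target site's own robots.txt file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /wiki/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Ask whether a generic crawler ("*") may fetch each (hypothetical) URL.
can_scrape_wiki = parser.can_fetch("*", "https://example.org/wiki/Toy_Story_3")
can_scrape_private = parser.can_fetch("*", "https://example.org/private/data")

print("wiki page allowed:", can_scrape_wiki)
print("private page allowed:", can_scrape_private)
```

To check a live site instead, `RobotFileParser` also provides `set_url()` and `read()`, which download the file before `can_fetch()` is called.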
Task #3: Clean our data!
- Task #3.1: Strip out all references ([1], [2], etc.) from the HTML
- Task #3.2: Split up the long strings
- Task #3.3: Examine errors we are getting
- Task #3.4: Convert “Running time” field to an integer
- Task #3.5: Convert “Budget” & “Box office” fields to floats
- Task #3.6: Convert dates into datetime objects
- Saving our data again using Pickle
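The cleaning steps in Task #3 can be sketched roughly as below. This is not the code from the video: the function names, the regular expressions, and the sample strings are assumptions chosen for illustration, and real infobox values are usually messier than these.

```python
import re
from datetime import datetime

# Word multipliers that may follow a dollar amount (assumed set, for illustration).
_MULTIPLIERS = {"thousand": 1e3, "million": 1e6, "billion": 1e9}
_MONEY = re.compile(
    r"\$\s*(\d[\d,]*(?:\.\d+)?)\s*(thousand|million|billion)?", re.IGNORECASE
)


def strip_references(text):
    """Remove bracketed citation markers such as [1] or [23]."""
    return re.sub(r"\[\d+\]", "", text).strip()


def minutes_to_int(running_time):
    """Turn a string like '103 minutes' into 103; None if no number is found."""
    match = re.search(r"\d+", running_time)
    return int(match.group()) if match else None


def money_to_float(text):
    """Turn '$200 million' or '$1,066,969,703' into a float; None if no match."""
    match = _MONEY.search(text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    word = match.group(2)
    return number * _MULTIPLIERS[word.lower()] if word else number


def parse_date(text):
    """Try a couple of common date layouts; None if neither fits."""
    for fmt in ("%B %d, %Y", "%d %B %Y"):
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


# Illustrative inputs only.
print(strip_references("Lee Unkrich[1][2]"))
print(minutes_to_int("103 minutes"))
print(money_to_float("$200 million"))
print(parse_date("June 18, 2010"))
```

Returning `None` instead of raising lets a scraper keep going when one field fails to parse, so that the failures can be reviewed afterwards. Because the cleaned records now hold `datetime` objects, which the standard `json` module cannot serialize directly, a format such as pickle is a natural choice for the intermediate save.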
Task #4: Attach IMDb, Metascore, and Rotten Tomatoes scores to the dataset (working with APIs)
Task #5: Save final dataset as a JSON file and as a CSV file
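As a rough idea of what the final export in Task #5 involves, the sketch below writes a list of dictionaries to both JSON and CSV using only the standard library. The records, field names, and helper functions are hypothetical, and the video may well use a different approach (for example a data-frame library) for the CSV step.

```python
import csv
import json
from datetime import datetime

# Hypothetical cleaned records; the second has a field the first lacks.
movies = [
    {"title": "Example Film A", "running_time": 103,
     "release_date": datetime(2010, 6, 18)},
    {"title": "Example Film B", "running_time": 95,
     "release_date": datetime(1995, 11, 22), "budget": 30000000.0},
]


def to_serializable(record):
    """Copy a record, replacing datetime values with ISO 8601 strings."""
    return {
        key: value.isoformat() if isinstance(value, datetime) else value
        for key, value in record.items()
    }


def save_json(path, records):
    """Write the records as a JSON array."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump([to_serializable(r) for r in records], f,
                  ensure_ascii=False, indent=2)


def save_csv(path, records):
    """Write the records as CSV; fields missing from a record are left blank."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)  # keep first-seen column order
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(to_serializable(r) for r in records)
```

Calling `save_json("movies.json", movies)` and `save_csv("movies.csv", movies)` would produce the two files; the file names here are placeholders.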
Taught by
Keith Galli
Related Courses
- Data Wrangling with MongoDB (MongoDB via Udacity)
- Getting and Cleaning Data (Johns Hopkins University via Coursera)
- 用Python玩转数据 Data Processing Using Python (Nanjing University via Coursera)
- Introduction to NodeJS (Microsoft via edX)
- 用 Python 做商管程式設計(三)(Programming for Business Computing in Python (3)) (National Taiwan University via Coursera)