Solving Real World Data Science Tasks With Python Beautiful Soup - Movie Dataset Creation
Offered By: Keith Galli via YouTube
Course Description
Overview
Syllabus
- Video overview
- Check out DataCamp! (sponsored)
- Setup
Task #1: Scrape the infobox from the Toy Story 3 wiki page (save in a Python dictionary)
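A minimal sketch of the Task #1 idea, assuming the infobox is an HTML table whose rows pair a `th` label with a `td` value. The inline `SAMPLE_HTML` is a hypothetical, trimmed-down stand-in for the real page, and `parse_infobox` is an illustrative name rather than the one used in the video.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down stand-in for a Wikipedia infobox table.
SAMPLE_HTML = """
<table class="infobox vevent">
  <tr><th colspan="2">Toy Story 3</th></tr>
  <tr><th>Directed by</th><td>Lee Unkrich</td></tr>
  <tr><th>Running time</th><td>103 minutes</td></tr>
</table>
"""


def parse_infobox(html):
    """Turn the first infobox table in `html` into a plain dictionary."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_="infobox")
    info = {}
    for index, row in enumerate(table.find_all("tr")):
        header = row.find("th")
        value = row.find("td")
        if index == 0 and header is not None:
            # Assumption: the first row holds the movie title.
            info["title"] = header.get_text(" ", strip=True)
        elif header is not None and value is not None:
            key = header.get_text(" ", strip=True)
            info[key] = value.get_text(" ", strip=True)
    return info


movie_info = parse_infobox(SAMPLE_HTML)
print(movie_info)
```

For a live page, the HTML string would come from an HTTP request instead of the inline sample; rows without both a label and a value are skipped here.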
Task #2: Scrape the infobox for all movies in the List of Disney Films (save as a list of dictionaries)
- Robots.txt (Are you allowed to scrape a site?)
- Task #2: Scrape the infobox for all movies in the List of Disney Films (save as a list of dictionaries)
- Save & load dataset checkpoint (JSON file)
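The robots.txt check and the JSON checkpoint could look roughly like the sketch below. It uses only the standard library; the rules in `SAMPLE_ROBOTS`, the `example.org` URLs, and the helper names `save_data` and `load_data` are assumptions made for illustration, and the rules are parsed offline rather than fetched from a real site.

```python
import json
import os
import tempfile
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, parsed offline instead of downloaded.
SAMPLE_ROBOTS = """
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS)
allowed = parser.can_fetch("*", "https://example.org/wiki/Some_Page")
blocked = parser.can_fetch("*", "https://example.org/private/page")


def save_data(path, data):
    """Write the scraped records to a JSON checkpoint file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


def load_data(path):
    """Read the records back from a JSON checkpoint file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Placeholder record standing in for the scraped list of dictionaries.
movies = [{"title": "Example Movie", "Running time": "100 minutes"}]

with tempfile.TemporaryDirectory() as folder:
    checkpoint = os.path.join(folder, "movies_checkpoint.json")
    save_data(checkpoint, movies)
    restored = load_data(checkpoint)
```

A checkpoint like this avoids re-scraping every page each time the cleaning code changes.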
Task #3: Clean our data!
- Task #3.1: Strip out all references ([1], [2], etc.) from HTML
- Task #3.2: Split up the long strings
- Task #3.3: Examine errors we are getting
- Task #3.4: Convert “Running time” field to an integer
- Task #3.5: Convert “Budget” & “Box office” fields to floats
- Task #3.6: Convert dates into datetime objects
- Saving our data again using Pickle
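The cleaning steps above can be sketched with small standard-library helpers. This is one possible approach under stated assumptions, not necessarily the one taken in the video: the function names, the accepted money formats, and the date formats are all illustrative choices.

```python
import pickle
import re
from datetime import datetime


def strip_references(text):
    """Remove bracketed citation markers such as [1] or [23]."""
    return re.sub(r"\[\d+\]", "", text).strip()


def split_lines(text):
    """Split a long multi-line string into a list of non-empty values."""
    return [part.strip() for part in text.split("\n") if part.strip()]


def minutes_to_int(running_time):
    """Return the first integer in the string, or None if there is none."""
    match = re.search(r"\d+", running_time or "")
    return int(match.group()) if match else None


MULTIPLIERS = {"thousand": 1e3, "million": 1e6, "billion": 1e9}


def money_to_float(text):
    """Parse strings like '$200 million' or '$1,234,567' into a float."""
    pattern = r"\$\s*([\d,]+(?:\.\d+)?)\s*(thousand|million|billion)?"
    match = re.search(pattern, text or "", re.IGNORECASE)
    if not match:
        return None
    number = float(match.group(1).replace(",", ""))
    word = match.group(2)
    return number * MULTIPLIERS[word.lower()] if word else number


def parse_date(text, formats=("%B %d, %Y", "%d %B %Y", "%Y")):
    """Try each assumed date format in turn; return None if none match."""
    cleaned = text.strip()
    for fmt in formats:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue
    return None


# Placeholder record showing the cleaned types side by side.
record = {
    "Running time": minutes_to_int("103 minutes"),
    "Budget": money_to_float("$200 million"),
    "Release date": parse_date("June 18, 2010"),
}

# datetime objects are not JSON-serializable, so pickle keeps them intact.
restored = pickle.loads(pickle.dumps(record))
```

Returning `None` for values that cannot be parsed keeps the errors easy to find when examining the dataset afterwards.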
Task #4: Attach IMDb, Metascore, and Rotten Tomatoes scores to the dataset (working with APIs)
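As a rough sketch of the Task #4 pattern, the snippet below builds a query URL and merges score fields into a movie record. The listing does not name the API, so the endpoint `api.example.com`, the parameter names `t` and `apikey`, the response field names, and the placeholder values are all hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the real API's base URL and parameters.
API_URL = "https://api.example.com/"


def build_query(title, api_key):
    """Build a GET URL that looks a movie up by title (assumed parameters)."""
    return API_URL + "?" + urlencode({"t": title, "apikey": api_key})


def attach_scores(movie, api_response):
    """Return a copy of `movie` with score fields copied from the response."""
    scored = dict(movie)
    for field in ("imdb", "metascore", "rotten_tomatoes"):
        # Missing fields become None instead of raising KeyError.
        scored[field] = api_response.get(field)
    return scored


url = build_query("Toy Story 3", "YOUR_KEY")

# Placeholder response with made-up values, not real scores.
fake_response = {"imdb": 7.0, "metascore": 70}
scored_movie = attach_scores({"title": "Example Movie"}, fake_response)
```

In a real run, `fake_response` would be replaced by the parsed JSON body returned for `url`.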
Task #5: Save final dataset as a JSON file and as a CSV file
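One way to approach Task #5 with only the standard library is sketched below. The records are placeholders, `to_json_ready` is an illustrative helper, and the output goes to in-memory strings so the example runs without touching the file system; the video may well use a different tool for the CSV export.

```python
import csv
import io
import json
from datetime import datetime

# Placeholder records standing in for the final cleaned dataset.
movies = [
    {"title": "Example Movie A", "Budget": 1000000.0,
     "Release date": datetime(2010, 6, 18)},
    {"title": "Example Movie B", "Release date": datetime(2011, 1, 1)},
]


def to_json_ready(record):
    """Convert datetime values to ISO strings so the record serializes."""
    return {
        key: value.isoformat() if isinstance(value, datetime) else value
        for key, value in record.items()
    }


rows = [to_json_ready(movie) for movie in movies]

# JSON export.
json_text = json.dumps(rows, indent=2)

# CSV export: the header is the union of all keys; missing values stay empty.
fieldnames = sorted({key for row in rows for key in row})
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Writing `json_text` and `csv_text` to files is then a matter of opening the target paths in text mode.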
Taught by
Keith Galli
Related Courses
- Data Analysis (Johns Hopkins University via Coursera)
- Computing for Data Analysis (Johns Hopkins University via Coursera)
- Scientific Computing (University of Washington via Coursera)
- Introduction to Data Science (University of Washington via Coursera)
- Web Intelligence and Big Data (Indian Institute of Technology Delhi via Coursera)