Scraping Data From the Web Course (How To)
Offered By: Treehouse
Course Description
Overview
Almost any information you want is available on the Internet. Web scraping is a key data-mining tool for exploring web pages and collecting that information for a variety of reporting needs. The tools and techniques used in this course let you collect data that would be hard to gather without automation.
What you'll learn
- An introduction to the Beautiful Soup Python package
- How to scrape a web page with Beautiful Soup
- An introduction to the Scrapy Python package
- How to crawl a website with Scrapy
- Web scraping considerations
Syllabus
Introducing Data Scraping
A look at what data scraping is and how it is used. We'll discuss how a web page is structured and use the Python package Beautiful Soup to scrape data from the web.
6 steps
- What is Data Scraping (2:54)
- Web Page Anatomy (3:47)
- Beautiful Soup (6:07)
- More Soup in the Tureen (6:18)
- Being a Good Citizen (3:07)
- Review Introduction to Data Scraping (5 questions)
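As a rough illustration of what scraping a single page with Beautiful Soup can look like, here is a minimal sketch. It is not taken from the course: the HTML string, the `extract_links` helper, and the horse names are invented for illustration, and the page is parsed from an in-memory string so no network access is needed. It assumes the `beautifulsoup4` package is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical page content; a real scraper would download this HTML first.
HTML = """
<html>
  <head><title>Sample Page</title></head>
  <body>
    <h1>Horses</h1>
    <ul class="gallery">
      <li><a href="/mustang.html">Mustang</a></li>
      <li><a href="/appaloosa.html">Appaloosa</a></li>
    </ul>
  </body>
</html>
"""


def extract_links(html):
    """Return (link text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")  # Python's built-in parser
    return [(a.get_text(strip=True), a["href"])
            for a in soup.find_all("a", href=True)]


soup = BeautifulSoup(HTML, "html.parser")
title = soup.title.string      # text inside the <title> tag
links = extract_links(HTML)    # every link on the page
```

The `"html.parser"` argument selects the parser bundled with Python, which avoids an extra dependency; other parsers can be swapped in if they are installed.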
A World Full of Spiders
To go beyond scraping a single web page, we need to crawl the web. Enter web crawlers, or spiders. We'll take a look at the basics of crawling the web with Scrapy and talk about saving scraped data.
5 steps
- Everyone Loves Charlotte (6:57)
- Installing Scrapy (3:18)
- Crawling Spiders (5:26)
- The Endless Web (6:58)
- A Review of Web Spiders (3 questions)
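The core idea behind a spider, following links from page to page without visiting any page twice, can be sketched with the standard library alone. This is not Scrapy code and not the course's implementation; Scrapy handles this kind of bookkeeping for you. The `SITE` dictionary, the `LinkParser` class, and the `crawl` function are all hypothetical names, and the "site" lives in memory so the sketch runs without a network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical three-page site held in memory (URL -> HTML body).
SITE = {
    "https://example.com/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "https://example.com/a.html": '<a href="/">Home</a> <a href="/b.html">B</a>',
    "https://example.com/b.html": '<a href="/a.html">A</a>',
}


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch):
    """Breadth-first crawl; returns URLs in the order they were visited.

    `fetch` is any callable mapping a URL to its HTML, or None if missing.
    """
    seen = {start_url}          # URLs already queued, to avoid loops
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


order = crawl("https://example.com/", SITE.get)
```

The `seen` set is what keeps the crawl from looping forever on pages that link back to each other, which is the practical concern on a site with many interlinked pages.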
Additional Scraping Tasks
Going beyond static web pages can be a challenge when scraping. Working with web forms and APIs can require a different approach. We'll also touch on how to write tests for a web scraper.
6 steps
- An Intelligent Spider (5:40)
- Scraping APIs (6:40)
- Using Scrapers for Site Testing (5:52)
- Common Issues with Data Scraping (instruction)
- Wrapping Up (0:37)
- Data Scraping Review (5 questions)
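When a site exposes an API, the scraper usually parses structured data such as JSON instead of HTML. The sketch below shows one way that might look, along with a test-style check on the parsing function. The response body, its field names (`results`, `name`, `height_cm`, `next`), and the `parse_results` helper are assumptions made up for illustration, not a real API or the course's code.

```python
import json

# Hypothetical API response body; a real scraper would download this text.
RESPONSE_BODY = """
{
  "results": [
    {"name": "Mustang", "height_cm": 150},
    {"name": "Appaloosa", "height_cm": 155}
  ],
  "next": null
}
"""


def parse_results(body):
    """Return ((name, height) rows, next page URL or None) from a JSON body."""
    payload = json.loads(body)
    rows = [(item["name"], item["height_cm"]) for item in payload["results"]]
    return rows, payload.get("next")


rows, next_page = parse_results(RESPONSE_BODY)

# A test-style check: parsing a saved response keeps the check
# repeatable, because it does not depend on a live site.
assert rows[0] == ("Mustang", 150)
assert next_page is None
```

Keeping the parsing step separate from the download step is a design choice that makes a scraper easier to test, since the parser can be exercised against saved responses.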
Related Courses
- Artificial Intelligence for Robotics (Stanford University via Udacity)
- Intro to Computer Science (University of Virginia via Udacity)
- Design of Computer Programs (Stanford University via Udacity)
- Web Development (Udacity)
- Programming Languages (University of Virginia via Udacity)