Scraping Data From the Web Course (How To)
Offered By: Treehouse
Course Description
Overview
Almost any information you want is available on the Internet. Web scraping is a key tool for mining that information: it lets you explore web pages and collect their data for a variety of reporting needs. The tools and techniques used in this course let you collect data that would otherwise be hard to gather without automation.
What you'll learn
- An introduction to the Beautiful Soup Python package
- How to scrape a web page with Beautiful Soup
- An introduction to the Scrapy Python package
- How to crawl a website with Scrapy
- Web scraping considerations
Syllabus
Introducing Data Scraping
A look at what data scraping is and how it is used. We'll discuss how a web page is structured and use the Python package Beautiful Soup to scrape data from the web.
6 steps
- What is Data Scraping (2:54)
- Web Page Anatomy (3:47)
- Beautiful Soup (6:07)
- More Soup in the Tureen (6:18)
- Being a Good Citizen (3:07)
- Review Introduction to Data Scraping (5 questions)
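As a rough illustration of the single-page scraping this stage introduces, the sketch below parses an inline HTML snippet with Beautiful Soup. The markup, the class name item, and the helper names page_title and extract_links are invented for this example and are not taken from the course material.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, inlined so the example runs without network access.
HTML = """
<html>
  <head><title>Example Catalog</title></head>
  <body>
    <h1>Products</h1>
    <ul>
      <li class="item"><a href="/items/1">Widget</a></li>
      <li class="item"><a href="/items/2">Gadget</a></li>
      <li class="note">Prices exclude tax</li>
    </ul>
  </body>
</html>
"""


def page_title(html):
    """Return the text of the page's <title> element."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.title.string


def extract_links(html):
    """Return (link text, href) pairs for each link inside an li.item element."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for li in soup.find_all("li", class_="item"):
        anchor = li.find("a")
        if anchor is not None:
            links.append((anchor.get_text(strip=True), anchor["href"]))
    return links


if __name__ == "__main__":
    print(page_title(HTML))
    for text, href in extract_links(HTML):
        print(text, href)
```

On a real site the HTML string would come from an HTTP response rather than a constant. Keeping the parsing in functions that accept a string makes them easy to try against saved pages.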
A World Full of Spiders
To go beyond scraping a single web page, we need to crawl the web. Enter web crawlers, or spiders. We'll take a look at the basics of crawling the web with Scrapy and talk about saving scraped data.
5 steps
- Everyone Loves Charlotte (6:57)
- Installing Scrapy (3:18)
- Crawling Spiders (5:26)
- The Endless Web (6:58)
- A Review of Web Spiders (3 questions)
Additional Scraping Tasks
Going beyond static web pages can be a challenge when scraping. Working with web forms and APIs can require a different approach. We'll also touch on how to write tests for a web scraper.
6 steps
- An Intelligent Spider (5:40)
- Scraping APIs (6:40)
- Using Scrapers for Site Testing (5:52)
- Common Issues with Data Scraping (instruction)
- Wrapping Up (0:37)
- Data Scraping Review (5 questions)
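One way to picture scraping an API and testing a scraper together is to keep the network call separate from the parsing, so the parser can be tested against a saved response. The sketch below uses only the Python standard library. The JSON shape, the SAMPLE fixture, and the function names are assumptions made for this example, not part of the course.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download a URL and return the response body as text (performs network I/O)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def parse_items(payload):
    """Extract (name, price) pairs from a JSON document shaped like SAMPLE."""
    data = json.loads(payload)
    return [(item["name"], float(item["price"])) for item in data.get("items", [])]


# Hypothetical API response, kept as a fixture so tests need no network access.
SAMPLE = '{"items": [{"name": "widget", "price": "2.50"}, {"name": "gadget", "price": "10"}]}'


def test_parse_items():
    """Check the parser against the fixture and against an empty document."""
    assert parse_items(SAMPLE) == [("widget", 2.5), ("gadget", 10.0)]
    assert parse_items("{}") == []


if __name__ == "__main__":
    test_parse_items()
    print("parser tests passed")
```

Because parse_items never touches the network, the test stays fast and repeatable. Only fetch_json would need a live endpoint or a stub.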
Related Courses
- Big Data (University of Adelaide via edX)
- Advanced Data Mining with Weka (University of Waikato via FutureLearn)
- AI For Lawyers (II): Tools for Legal Professionals (National Chiao Tung University via FutureLearn)
- Graph Algorithms (University of California, San Diego via edX)
- Minería de datos aplicada al marketing (Universidad Anáhuac via edX)