SUMMARY: The purpose of this project is to practice web scraping by gathering specific pieces of information from a website. The scraping code is written in Python and uses the Scrapy framework.
INTRODUCTION: David Lowe hosts his blog at merelydoit.blog. The purpose of this exercise is to practice web scraping by gathering the blog entries from Merely Do It’s RSS feed. This iteration of the script automatically traverses the RSS feed to capture all entries from the blog site.
Starting URLs: https://merelydoit.blog/feed or https://merelydoit.blog/feed/?paged=1
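The feed traversal described above can be sketched as follows. This is a minimal illustration, not the project's actual Scrapy spider: it uses only the Python standard library, and the helper names (`parse_feed_items`, `next_page_url`), the extracted fields, and the `SAMPLE_FEED` document are assumptions made for the example. It assumes the feed is plain RSS 2.0 and that later pages are reached by incrementing the `paged` query parameter shown in the starting URLs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse

# Hypothetical, hand-written RSS 2.0 snippet used only to exercise the helpers.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First Post</title>
      <link>https://example.com/first-post/</link>
      <pubDate>Mon, 01 Jan 2018 12:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Second Post</title>
      <link>https://example.com/second-post/</link>
      <pubDate>Tue, 02 Jan 2018 12:00:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def parse_feed_items(xml_text):
    """Return one dict per <item> element found in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "pub_date": item.findtext("pubDate", default="").strip(),
        })
    return items


def next_page_url(url):
    """Return the URL of the following feed page by incrementing `paged`.

    A URL with no `paged` parameter is treated as page 1.
    """
    parts = urlparse(url)
    query = parse_qs(parts.query)
    current_page = int(query.get("paged", ["1"])[0])
    query["paged"] = [str(current_page + 1)]
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


entries = parse_feed_items(SAMPLE_FEED)
following_url = next_page_url("https://merelydoit.blog/feed/?paged=1")
```

In a Scrapy spider, logic like this would typically sit in the `parse` callback: emit the entries from the current page, then request the next page URL. One plausible stopping rule is to end the crawl when a page yields no items or the server returns an error status; how the original script decides to stop is not stated here.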
The source code and JSON output are available on GitHub.