SUMMARY: The purpose of this project is to practice web scraping by gathering specific pieces of information from a website. The web scraping code is written in Python and uses the BeautifulSoup module.
INTRODUCTION: Daines Analytics hosts its blog at dainesanalytics.blog. The purpose of this exercise is to practice web scraping by gathering the blog entries from Daines Analytics’ RSS feed. The script automatically traverses the RSS feed to capture all blog entries in a JSON document.
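The traversal described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it parses the feed with the standard library's xml.etree.ElementTree instead of BeautifulSoup so that it runs without extra installs. It also assumes the feed is plain RSS 2.0 with title, link, and pubDate elements, and that requesting a page past the last one returns an HTTP error. The names parse_feed_page, scrape_feed, fetch_url, and to_json are hypothetical.

```python
import json
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

# URL pattern taken from the paged starting URL; {page} is filled in per request.
FEED_URL = "https://dainesanalytics.blog/feed/?paged={page}"


def parse_feed_page(xml_text):
    """Return one dict per <item> in an RSS 2.0 document (str or bytes)."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "published": item.findtext("pubDate", default="").strip(),
        })
    return entries


def scrape_feed(fetch, max_pages=100):
    """Walk paged=1, 2, ... until a page is missing or contains no items."""
    all_entries = []
    for page in range(1, max_pages + 1):
        xml_text = fetch(FEED_URL.format(page=page))
        if xml_text is None:  # assumption: the fetcher signals "no such page" with None
            break
        entries = parse_feed_page(xml_text)
        if not entries:
            break
        all_entries.extend(entries)
    return all_entries


def fetch_url(url):
    """Download one page as bytes; return None on an HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return response.read()
    except urllib.error.HTTPError:
        return None


def to_json(entries):
    """Serialize the captured entries as a JSON document."""
    return json.dumps(entries, indent=2, ensure_ascii=False)
```

Under these assumptions, to_json(scrape_feed(fetch_url)) would return the JSON text for every entry in the feed. Passing the fetcher in as an argument keeps the paging logic testable without network access.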
For this second iteration, the script will also store the captured information in a remote relational database.
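The storage step might look something like the sketch below. The post does not name the database or its schema, so this uses Python's built-in sqlite3 module as a local stand-in for the remote database. The table name blog_entries, its columns, and the function store_entries are all assumptions made for illustration.

```python
import sqlite3


def store_entries(conn, entries):
    """Insert entries (dicts with link, title, published) into a table.

    The table layout is hypothetical; the link column serves as the key,
    so re-running the scraper replaces rows instead of duplicating them.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blog_entries ("
        "link TEXT PRIMARY KEY, title TEXT, published TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO blog_entries (link, title, published) "
        "VALUES (:link, :title, :published)",
        entries,
    )
    conn.commit()
```

For a quick local trial, a connection from sqlite3.connect(":memory:") can be passed in. A real remote database would need its own driver and connection settings, and the INSERT OR REPLACE syntax is specific to SQLite.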
Starting URLs: https://dainesanalytics.blog/feed or https://dainesanalytics.blog/feed/?paged=1
The source code and JSON output are available on GitHub.