SUMMARY: This project practices web scraping by extracting specific pieces of information from a website. The Python scraping code uses the Selenium module.
INTRODUCTION: Metro is the transportation planner, coordinator, designer, builder and operator for Los Angeles County, one of the country’s largest and most populous counties. More than 9.6 million people, nearly one-third of California’s residents, live, work and play within its 1,433-square-mile service area. This exercise gathers the bus ridership statistics from the agency’s web pages. This iteration of the script automatically traverses the monthly web pages (January 2009 through June 2020), captures every bus ridership entry, and stores the results in a CSV output file.
Starting URL: http://isotp.metro.net/MetroRidership/Index.aspx
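The month-by-month traversal and CSV export described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `month_range`, `collect`, `write_rows`, and the `fetch_month` callback are hypothetical, and the Selenium step that loads a monthly page and reads its table is deliberately left to the caller, since the page layout is not described here.

```python
import csv
from datetime import date

# Date range taken from the description above (inclusive on both ends).
START = date(2009, 1, 1)
END = date(2020, 6, 1)


def month_range(start, end):
    """Yield (year, month) pairs from start through end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def collect(fetch_month, start=START, end=END):
    """Call fetch_month(year, month) for every month and flatten the results.

    fetch_month is a hypothetical callback; in a real run it would wrap the
    Selenium logic and return an iterable of row tuples for that month.
    """
    rows = []
    for year, month in month_range(start, end):
        for entry in fetch_month(year, month):
            rows.append([year, month, *entry])
    return rows


def write_rows(path, header, rows):
    """Write a header line followed by all rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
```

Keeping the page-fetching logic behind a callback means the loop and the CSV output can be checked with a stub function before a browser session is involved.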
The source code and HTML output can be found here on GitHub.