SUMMARY: The purpose of this project is to practice web scraping by extracting specific pieces of information from a website. The Python scraping code uses the BeautifulSoup module.
INTRODUCTION: The Conference on Neural Information Processing Systems (NeurIPS) covers a wide range of research on neural information processing systems, spanning biological, technological, mathematical, and theoretical work. Neural information processing is a field that benefits from a combined view of the biological, physical, mathematical, and computational sciences. This web scraping script automatically traverses the entire web page and collects all links to PDF and PPTX documents. The script also downloads those documents as part of the scraping process.
Starting URL: https://papers.nips.cc/book/advances-in-neural-information-processing-systems-32-2019
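The collect-and-download steps described in the introduction can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names `collect_document_links` and `download`, the `downloads` folder, and the use of `urllib` for fetching are all assumptions made for the example. It also assumes that the document links appear on the fetched page as anchor tags whose `href` ends in `.pdf` or `.pptx`.

```python
import os
import shutil
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

START_URL = "https://papers.nips.cc/book/advances-in-neural-information-processing-systems-32-2019"
DOC_EXTENSIONS = (".pdf", ".pptx")  # file types the introduction mentions


def collect_document_links(html, base_url):
    """Return absolute URLs of all PDF/PPTX links found in the HTML, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        url = urljoin(base_url, anchor["href"])  # resolve relative links
        path = urlparse(url).path.lower()
        if path.endswith(DOC_EXTENSIONS) and url not in links:
            links.append(url)
    return links


def download(url, dest_dir="downloads"):
    """Save the file at url into dest_dir (hypothetical folder name) and return its path."""
    os.makedirs(dest_dir, exist_ok=True)
    filename = os.path.basename(urlparse(url).path)
    target = os.path.join(dest_dir, filename)
    with urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


if __name__ == "__main__":
    # Network access happens only when the file is run as a script.
    with urlopen(START_URL) as response:
        page_html = response.read().decode("utf-8", errors="replace")
    for link in collect_document_links(page_html, START_URL):
        print("Downloading", link)
        download(link)
```

Keeping the link extraction separate from the download step makes it possible to check the parsing logic against a saved copy of the page without touching the network. If the documents are linked from per-paper pages rather than the starting page itself, the same function could be applied to each of those pages in turn.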
The source code and HTML output can be found on GitHub.