Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Intel Image Classification dataset presents a multi-class classification problem, in which we attempt to predict one of more than two possible outcomes.
INTRODUCTION: This dataset contains over 17,000 images of size 150×150 distributed across six categories: buildings, forest, glacier, mountain, sea, and street. There are approximately 14,000 images in the training set and 3,000 in the test/validation set. Intel initially published this dataset on https://datahack.analyticsvidhya.com as part of a data science competition.
In this Take1 iteration, we will construct a simple three-layer CNN as the baseline model. We will use this model’s performance as the reference point for future modeling iterations.
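To make the idea of a three-layer baseline more concrete, the sketch below traces how a 150×150 image shrinks through three convolution-plus-pooling stages and how many trainable parameters each stage carries. This is an illustration only. The filter counts (32, 64, 128), the 3×3 kernels, the 2×2 max pooling, the three color channels, and the reading of "three-layer" as three convolutional stages are all assumptions, because the write-up does not state them. It is plain Python arithmetic rather than TensorFlow code, so it runs without any framework installed.

```python
# Hypothetical layer settings: the write-up does not give filter counts,
# kernel sizes, or pooling, so every value below is an assumption.
INPUT_SHAPE = (150, 150, 3)  # 150x150 images; 3 color channels assumed (RGB)
NUM_CLASSES = 6              # buildings, forest, glacier, mountain, sea, street
FILTERS = (32, 64, 128)      # assumed filters per convolutional stage
KERNEL = 3                   # assumed 3x3 kernels, stride 1, no padding
POOL = 2                     # assumed 2x2 max pooling after each convolution


def conv_block(shape, filters, kernel=KERNEL, pool=POOL):
    """Return (output_shape, trainable_params) for one conv + max-pool stage."""
    height, width, channels = shape
    # Each filter has kernel*kernel*channels weights plus one bias term.
    params = (kernel * kernel * channels + 1) * filters
    # An unpadded convolution trims kernel-1 pixels per axis;
    # pooling then floor-divides each axis by the pool size.
    height = (height - kernel + 1) // pool
    width = (width - kernel + 1) // pool
    return (height, width, filters), params


def summarize(input_shape=INPUT_SHAPE, filters=FILTERS, num_classes=NUM_CLASSES):
    """List (name, output_shape, params) for each stage and the output layer."""
    shape, rows = input_shape, []
    for index, count in enumerate(filters, start=1):
        shape, params = conv_block(shape, count)
        rows.append((f"conv_block_{index}", shape, params))
    # Flatten the last feature map and connect it to one unit per class.
    flat = shape[0] * shape[1] * shape[2]
    rows.append(("dense_softmax", (num_classes,), (flat + 1) * num_classes))
    return rows


if __name__ == "__main__":
    for name, shape, params in summarize():
        print(f"{name:<14} output={str(shape):<16} params={params:,}")
```

Under these assumed settings, the feature map shrinks from 150×150 to 17×17 across the three stages, and most of the parameters sit in the final dense layer rather than in the convolutions.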
ANALYSIS: In this Take1 iteration, the baseline model achieved an accuracy of 88.62% on the training images after 30 epochs. On the validation images, the same model achieved an accuracy of 85.37%.
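As a reminder of what these accuracy figures measure, the sketch below computes multi-class accuracy from per-class probabilities: each image is assigned the class with the highest probability, and accuracy is the fraction of images whose assigned class matches the true label. The helper names and the four sample probability rows are made up for illustration and are not taken from the project. Only the six class names and the two reported percentages come from the text above.

```python
CLASS_NAMES = ["buildings", "forest", "glacier", "mountain", "sea", "street"]


def predict_class(probabilities):
    """Return the index of the highest probability (the predicted class)."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Made-up model outputs for four images (one probability per class).
sample_probabilities = [
    [0.70, 0.05, 0.05, 0.05, 0.05, 0.10],  # predicted: buildings
    [0.10, 0.60, 0.10, 0.10, 0.05, 0.05],  # predicted: forest
    [0.05, 0.05, 0.45, 0.10, 0.30, 0.05],  # predicted: glacier (true: sea)
    [0.20, 0.05, 0.05, 0.05, 0.05, 0.60],  # predicted: street
]
sample_true = [0, 1, 4, 5]  # made-up true class indices

sample_predicted = [predict_class(row) for row in sample_probabilities]
sample_accuracy = accuracy(sample_true, sample_predicted)

# Gap between the reported training and validation accuracy, in points.
accuracy_gap = round(88.62 - 85.37, 2)

if __name__ == "__main__":
    print([CLASS_NAMES[i] for i in sample_predicted])
    print(f"sample accuracy: {sample_accuracy:.2%}")
    print(f"train/validation gap: {accuracy_gap} percentage points")
```

With the reported figures, training accuracy exceeds validation accuracy by 3.25 percentage points, a difference that later iterations can track when comparing against this baseline.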
CONCLUSION: In this iteration, the TensorFlow CNN model reached a validation accuracy of 85.37% with a simple baseline architecture, so it appears suitable for modeling this dataset. We should consider further experiments with TensorFlow in future iterations.
Dataset Used: Intel Image Classification Dataset
Dataset ML Model: Multi-class image classification with numerical attributes
Dataset Reference: https://www.kaggle.com/puneet6060/intel-image-classification
One potential source of performance benchmarks: https://www.kaggle.com/puneet6060/intel-image-classification
The HTML formatted report can be found here on GitHub.