Binary Classification Deep Learning Model for Cats and Dogs Using Keras Take 6

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The “Cats and Dogs” dataset poses a binary classification problem in which we try to predict one of two possible outcomes.

INTRODUCTION: Web services are often protected with a challenge that’s supposed to be easy for people to solve, but difficult for computers. Such a challenge is often called a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) or HIP (Human Interactive Proof). ASIRRA (Animal Species Image Recognition for Restricting Access) is a HIP that works by asking users to identify photographs of cats and dogs. This task is difficult for computers, but studies have shown that people can accomplish it quickly and accurately.

The current literature suggests that machine classifiers can score above 80% accuracy on this task. Therefore, ASIRRA is no longer considered safe from attack. Kaggle created a contest to benchmark the latest computer vision and deep learning approaches to this problem. The training archive contains 25,000 images of dogs and cats. We will need to train our algorithm on these files and predict the correct labels for the test dataset.

In iteration Take1, we constructed a simple VGG convolutional model with one VGG block to classify the images. This model serves as the baseline for the later iterations.

In iteration Take2, we constructed a simple VGG convolutional model with two VGG blocks to classify the images. The additional block improved on the baseline model.

In iteration Take3, we constructed a simple VGG convolutional model with three VGG blocks to classify the images. The third block improved on the baseline model further.
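The stacked-block idea from Take1 through Take3 can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the 200x200 RGB input size, the single 3x3 convolution per block, the filter counts that start at 32 and double per block, the 128-unit dense layer, and the SGD optimizer are all assumptions made for illustration. The names vgg_filters and build_vgg_model are hypothetical.

```python
def vgg_filters(n_blocks, base_filters=32):
    """Filter count for each block, doubling per block (an assumed schedule)."""
    return [base_filters * 2 ** i for i in range(n_blocks)]


def build_vgg_model(n_blocks=1, input_shape=(200, 200, 3)):
    """Build a small VGG-style binary classifier with `n_blocks` blocks."""
    # Imported lazily so vgg_filters() can be used without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    for filters in vgg_filters(n_blocks):
        # One "VGG block" here: a 3x3 convolution followed by 2x2 max pooling.
        model.add(layers.Conv2D(filters, (3, 3), activation="relu", padding="same"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    # A single sigmoid unit gives the probability of one of the two classes.
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])
    return model


print(vgg_filters(3))
```

Under these assumptions, build_vgg_model(1), build_vgg_model(2), and build_vgg_model(3) would correspond to the Take1, Take2, and Take3 architectures respectively.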

In iteration Take4, we applied dropout to our VGG-3 model. Adding the dropout layers improved our model.
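To make the dropout mechanism concrete, here is a small pure-Python illustration of "inverted" dropout on a list of activations. It is a sketch of the general technique rather than the code used in this project. The dropout function and the 0.5 rate are chosen for illustration only. During training, each value is zeroed with probability rate and the survivors are scaled by 1 / (1 - rate); at inference time, the values pass through unchanged.

```python
import random


def dropout(activations, rate, training=True, rng=None):
    """Apply inverted dropout to a list of floats."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # Inference: dropout is a no-op.
        return list(activations)
    rng = rng or random.Random()
    # Scale survivors so the expected value of each unit is unchanged.
    scale = 1.0 / (1.0 - rate)
    return [a * scale if rng.random() >= rate else 0.0 for a in activations]


activations = [0.5, 1.0, 1.5, 2.0]
train_out = dropout(activations, rate=0.5, rng=random.Random(0))
infer_out = dropout(activations, rate=0.5, training=False)
print(train_out)  # some entries zeroed, the rest doubled
print(infer_out)  # identical to the input
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit, which is why dropout tends to narrow the gap between training and test accuracy.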

In iteration Take5, we applied image data augmentation to our VGG-3 model. Adding the image data augmentation improved our model.
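Image data augmentation creates slightly altered copies of the training images so the model sees more variety. The toy example below shows two common transformations, a horizontal flip and a small horizontal shift, on an image represented as a nested list of pixel values. Which transformations this project actually used is not stated here, so the choice of flip and shift, the helper names, and the max_shift parameter are all assumptions for illustration.

```python
import random


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def shift_width(image, dx, fill=0):
    """Shift each row dx pixels right (negative dx shifts left), padding with `fill`."""
    width = len(image[0])
    dx = max(-width, min(width, dx))  # clamp so slicing stays well defined
    shifted = []
    for row in image:
        if dx >= 0:
            shifted.append([fill] * dx + row[: width - dx])
        else:
            shifted.append(row[-dx:] + [fill] * (-dx))
    return shifted


def augment(image, rng, max_shift=1):
    """Randomly flip and shift one image; the label is left unchanged."""
    out = horizontal_flip(image) if rng.random() < 0.5 else image
    return shift_width(out, rng.randint(-max_shift, max_shift))


image = [[1, 2, 3],
         [4, 5, 6]]
print(horizontal_flip(image))
print(shift_width(image, 1))
print(augment(image, random.Random(42)))
```

In practice the same idea is applied on the fly to each training batch, so the model rarely sees exactly the same pixels twice.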

In this iteration, we will apply both dropout layers and image data augmentation to our VGG-3 model. We hope the addition of both techniques will further improve our model.

ANALYSIS: In iteration Take1, the model achieved a training accuracy of 95.55% after 20 epochs. On the test dataset, however, the same model reached an accuracy of only 72.99%. The plot shows that the model started to overfit the training dataset after only ten epochs. We will need to explore other modeling approaches to reduce the overfitting.

In iteration Take2, the model achieved a training accuracy of 97.94% after 20 epochs. On the test dataset, however, the same model reached an accuracy of only 75.67%. The plot shows that the model started to overfit the training dataset after only seven epochs. We will need to explore other modeling approaches to reduce the overfitting.

In iteration Take3, the model achieved a training accuracy of 97.14% after 20 epochs. On the test dataset, however, the same model reached an accuracy of only 80.19%. The plot shows that the model started to overfit the training dataset after only six epochs. We will need to explore other modeling approaches to reduce the overfitting.

In iteration Take4, the model achieved a training accuracy of 86.92% after 50 epochs and an accuracy of 81.04% on the test dataset. The plot suggests that adding dropout layers can be a good tactic for improving the model’s predictive performance.

In iteration Take5, the model achieved a training accuracy of 87.52% after 50 epochs and an accuracy of 85.12% on the test dataset. The plot suggests that image data augmentation can be a good tactic for improving the model’s predictive performance.

In this iteration, the Take6 model achieved a training accuracy of 88.60% after 200 epochs and an accuracy of 87.25% on the test dataset. The plot suggests that combining dropout layers with image data augmentation can produce a low-variance model that does not overfit early in training.
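One way to compare the iterations is to look at the gap between training and test accuracy, since a large gap is a sign of overfitting. The short script below computes that gap from the figures reported above. The gap metric and the generalization_gap helper are additions for illustration and are not part of the original analysis.

```python
# (iteration, training accuracy %, test accuracy %, epochs) as reported above
results = [
    ("Take1", 95.55, 72.99, 20),
    ("Take2", 97.94, 75.67, 20),
    ("Take3", 97.14, 80.19, 20),
    ("Take4", 86.92, 81.04, 50),
    ("Take5", 87.52, 85.12, 50),
    ("Take6", 88.60, 87.25, 200),
]


def generalization_gap(train_acc, test_acc):
    """Difference between training and test accuracy, in percentage points."""
    return round(train_acc - test_acc, 2)


gaps = {name: generalization_gap(train, test) for name, train, test, _ in results}

for name, train, test, epochs in results:
    print(f"{name}: train {train:.2f}%  test {test:.2f}%  "
          f"gap {gaps[name]:.2f} points  ({epochs} epochs)")
```

The gap shrinks from more than 22 points in Take1 to under 2 points in Take6, which is consistent with the observation that dropout and augmentation together reduce overfitting.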

CONCLUSION: For this dataset, the model built using Keras and TensorFlow did not achieve a result comparable to the Kaggle competition benchmarks. We should explore additional and different modeling approaches.

Dataset Used: Cats and Dogs Dataset

Dataset ML Model: Binary classification with numerical attributes

Dataset Reference: https://www.microsoft.com/en-us/download/details.aspx?id=54765

One potential source of performance benchmarks: https://www.kaggle.com/c/dogs-vs-cats/overview

The HTML-formatted report can be found on GitHub.