Binary-Class Image Classification Deep Learning Model for PatchCamelyon Grand Challenge Using TensorFlow Take 3

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The PatchCamelyon Grand Challenge dataset is a binary-class classification situation where we attempt to predict one of two possible outcomes.

INTRODUCTION: The PatchCamelyon benchmark is a new and challenging image classification dataset. It consists of 327680 color images (96 x 96px) extracted from histopathologic scans of lymph node sections. Each image is annotated with a binary label indicating the presence of metastatic tissue. This dataset provides a useful benchmark for machine learning models that are bigger than CIFAR10 but smaller than ImageNet.

n iteration Take1, we constructed a CNN model using a simple three-block VGG architecture and tested the model’s performance using a held-out test dataset.

In iteration Take2, we constructed a CNN model using the InceptionV3 architecture and tested the model’s performance using a held-out test dataset.

In this Take3 iteration, we will construct a CNN model using the ResNet50 architecture and test the model’s performance using a held-out test dataset.

ANALYSIS: In iteration Take1, the baseline model’s performance achieved an accuracy score of 79.83% on the validation dataset after ten epochs. After we apply the final model to the test dataset, the model achieved an accuracy score of 79.00%.

In iteration Take2, the InceptionV3 model’s performance achieved an accuracy score of 83.74% on the validation dataset after ten epochs. After we apply the final model to the test dataset, the model achieved an accuracy score of 79.00%.

In this Take3 iteration, the ResNet50 model’s performance achieved an accuracy score of 85.09% on the validation dataset after ten epochs. After we apply the final model to the test dataset, the model achieved an accuracy score of 78.05%.

CONCLUSION: In this iteration, the ResNet50 CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: PatchCamelyon Grand Challenge

Dataset ML Model: Binary-class image classification with numerical attributes

Dataset Reference: https://patchcamelyon.grand-challenge.org/

A potential source of performance benchmarks: https://patchcamelyon.grand-challenge.org/evaluation/challenge/leaderboard/

The HTML formatted report can be found here on GitHub.