Multi-Class Image Classification Deep Learning Model for Kaggle UT Zappos50K Shoe Dataset Using TensorFlow Take 5

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Kaggle UT Zappos50K Shoe dataset presents a multi-class classification problem, where we attempt to predict one of several (more than two) possible outcomes.

UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The dataset divides the images into four major categories (shoes, sandals, slippers, and boots), which are further subdivided by functional type and individual brand. The research team created this dataset in the context of an online shopping task, where users pay special attention to fine-grained visual differences.
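To make the multi-class setup concrete, the following is a minimal sketch of how a classifier's raw scores could be turned into one of the four major categories. The score values and the function names `softmax` and `predict_category` are illustrative assumptions and do not come from the original project.

```python
import math

# The four major categories named in the dataset description.
CATEGORIES = ["shoes", "sandals", "slippers", "boots"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_category(logits, categories=CATEGORIES):
    """Return the most likely category and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs[best]


# Hypothetical scores for one image, one score per category.
label, confidence = predict_category([0.2, 2.5, -1.0, 0.7])
print(label, round(confidence, 3))
```

Unlike a binary classifier, the model emits one score per category, and the predicted label is the category with the highest probability.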

In this iteration, we will construct a CNN model based on the InceptionV3 architecture to predict the shoe category from the available images.
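One way such a model could be assembled with the Keras API in TensorFlow is sketched below. This is not the project's actual code: the input size, the pooling and dense head, the optimizer, the class-name ordering, and the choice of `weights=None` (random initialization) are all assumptions, and the source does not say whether pre-trained ImageNet weights were used.

```python
import tensorflow as tf

# Assumed label order; the actual order depends on how the data is loaded.
CLASS_NAMES = ["Boots", "Sandals", "Shoes", "Slippers"]


def build_inceptionv3_classifier(num_classes=len(CLASS_NAMES),
                                 input_shape=(150, 150, 3),
                                 weights=None):
    """Build an InceptionV3-based multi-class classifier (illustrative sketch).

    Pass weights="imagenet" to start from pre-trained weights instead of
    random initialization.
    """
    # Convolutional base without the original 1000-class ImageNet head.
    base = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=input_shape)

    inputs = tf.keras.Input(shape=input_shape)
    # Scale pixel values from [0, 255] to [-1, 1], the range InceptionV3 expects.
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    # One softmax output per shoe category.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_inceptionv3_classifier()
```

Training would then be a call such as `model.fit(train_ds, validation_data=val_ds, epochs=10)`, where `train_ds` and `val_ds` are hypothetical dataset objects yielding image batches with integer labels.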

ANALYSIS: In this iteration, the InceptionV3 model achieved an accuracy of 98.34% on the training dataset after ten epochs. On the validation dataset, the final model achieved an accuracy of 87.28%.
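The two figures above can be compared directly. The short sketch below computes the gap between training and validation accuracy using only the reported numbers; the helper name `accuracy_gap_points` is an illustrative assumption. A gap of this size is often read as a sign of overfitting, although the original analysis does not draw that conclusion.

```python
def accuracy_gap_points(train_accuracy, val_accuracy):
    """Return the train-minus-validation accuracy gap in percentage points."""
    return (train_accuracy - val_accuracy) * 100.0


# Accuracies reported in the analysis, expressed as fractions.
TRAIN_ACCURACY = 0.9834
VAL_ACCURACY = 0.8728

gap = accuracy_gap_points(TRAIN_ACCURACY, VAL_ACCURACY)
val_error_rate = 1.0 - VAL_ACCURACY

print(f"Gap: {gap:.2f} percentage points")             # Gap: 11.06 percentage points
print(f"Validation error rate: {val_error_rate:.2%}")  # Validation error rate: 12.72%
```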

CONCLUSION: In this iteration, the InceptionV3-based CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Kaggle UT Zappos50K Shoe Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/grassknoted/asl-alphabet

One potential source of performance benchmarks: https://www.kaggle.com/grassknoted/asl-alphabet/code

The HTML-formatted report is available on GitHub.