Multi-Class Image Classification Model for Vegetable Image Dataset Using TensorFlow Take 3

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Vegetable Image Dataset presents a multi-class classification problem in which we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: From vegetable production to delivery, common steps such as picking and sorting vegetables are often done manually. The research team wanted to improve the operation by developing a deep neural network model to detect and classify vegetables. Such a model can be deployed on different devices and can also address related vegetable-identification problems, such as labeling vegetables automatically without manual effort.

The initial experiment looked at 15 common vegetables found throughout the world. The vegetables chosen for the experiment are bean, bitter gourd, bottle gourd, brinjal, broccoli, cabbage, capsicum, carrot, cauliflower, cucumber, papaya, potato, pumpkin, radish, and tomato. A total of 21,000 images from 15 classes are used, where each class contains 1,400 images of size 224×224 in JPG format. The dataset is split 70% for training, 15% for validation, and 15% for testing.
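The split sizes implied by these figures can be checked with a little arithmetic. The following is a minimal sketch, not part of the original project: the names `split_counts`, `overall`, and `per_class` are hypothetical, and the per-class counts assume the split is applied evenly within each class, which the description above does not state explicitly.

```python
# Figures taken from the dataset description above.
TOTAL_IMAGES = 21000
NUM_CLASSES = 15
SPLITS = {"train": 70, "validation": 15, "test": 15}  # percentages


def split_counts(total, percentages):
    """Return the number of images in each split for a given total."""
    return {name: total * pct // 100 for name, pct in percentages.items()}


images_per_class = TOTAL_IMAGES // NUM_CLASSES

# Image counts per split across the whole dataset.
overall = split_counts(TOTAL_IMAGES, SPLITS)

# Image counts per split within one class (assumes an even, per-class split).
per_class = split_counts(images_per_class, SPLITS)

print(overall)
print(per_class)
```

Under these assumptions, the training split holds 14,700 images (980 per class), and the validation and test splits hold 3,150 images each (210 per class).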

ANALYSIS: The baseline model achieved an accuracy score of 95.53% on the validation dataset after ten epochs. After tuning the hyperparameters, the best model reached an accuracy score of 99.70% on the validation dataset. Furthermore, the final model achieved an accuracy of 98.67% on the test dataset.

CONCLUSION: In this iteration, the TensorFlow VGG19 CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.
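For readers unfamiliar with the approach, a VGG19-based classifier for 15 classes could be assembled in TensorFlow roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual code: the function name `build_vgg19_classifier`, the pooling and dense head, the Adam optimizer, and the learning rate are all illustrative choices that the summary above does not specify. The sketch also defaults to `weights=None` (random initialization) so it runs without downloading anything; whether the original project used pre-trained weights is not stated here.

```python
import tensorflow as tf

NUM_CLASSES = 15          # number of vegetable classes in the dataset
IMAGE_SIZE = (224, 224)   # image size stated in the dataset description


def build_vgg19_classifier(num_classes=NUM_CLASSES, weights=None):
    """Build a VGG19 convolutional base with a small classification head.

    The head, optimizer, and learning rate are assumptions for illustration.
    Pass weights="imagenet" to start from pre-trained weights instead.
    """
    base = tf.keras.applications.VGG19(
        include_top=False,              # drop the original 1000-class head
        weights=weights,
        input_shape=IMAGE_SIZE + (3,),
    )

    # Input preprocessing is assumed to happen in the data pipeline.
    inputs = tf.keras.Input(shape=IMAGE_SIZE + (3,))
    x = base(inputs)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model


model = build_vgg19_classifier()
```

The `accuracy` metric compiled here corresponds to the kind of accuracy score reported in the ANALYSIS section; training would then proceed with `model.fit` on the training and validation splits.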

Dataset Used: Vegetable Image Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference:

One potential source of performance benchmarks:

The HTML formatted report can be found here on GitHub.