Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The AIR Lab iBeans dataset presents a multi-class classification problem, in which we try to predict one of several (more than two) possible outcomes.
INTRODUCTION: The dataset consists of leaf images taken in the field in different districts of Uganda by the Makerere AI lab in collaboration with the National Crops Resources Research Institute (NaCRRI), Uganda’s national body in charge of agricultural research.
The goal is to build a robust machine learning model that can distinguish between diseases in bean plants. The leaf images represent three classes: healthy leaves and two disease classes, Angular Leaf Spot and Bean Rust. The model should distinguish between these three classes with high accuracy. The end goal is a model that can be deployed on a mobile device and used in the field by a farmer.
In iteration Take1, we constructed and tuned machine learning models for this dataset using TensorFlow with a simple VGG-5 network. We also recorded the best results we could obtain on the validation and test datasets. The final output from this iteration became our baseline performance level for future iterations.
In iteration Take2, we constructed and tuned machine learning models for this dataset using a simple VGG-5 network with dropout regularization, using rates of 0.2 for the convolution layers and 0.5 for the fully connected layers. The dropout layers and rates were inspired by the original VGG research paper. We also recorded the best results we could obtain on the validation and test datasets.
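To make the two dropout rates concrete, here is a minimal, framework-free sketch of inverted dropout, the variant commonly applied at training time. It is not the project's TensorFlow code: the `dropout` helper, the fixed seed, and the all-ones activations are assumptions made purely for illustration. Only the rates 0.2 and 0.5 come from the description above.

```python
import random


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate` and
    scale the survivors by 1 / (1 - rate) so the expected sum is unchanged."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


CONV_DROPOUT = 0.2   # rate reported for the convolution layers
DENSE_DROPOUT = 0.5  # rate reported for the fully connected layers

rng = random.Random(0)        # fixed seed, chosen only for repeatability
activations = [1.0] * 10_000  # hypothetical activations, not real data

conv_out = dropout(activations, CONV_DROPOUT, rng)
dense_out = dropout(activations, DENSE_DROPOUT, rng)

# Fraction of units zeroed out in each case.
conv_dropped = conv_out.count(0.0) / len(conv_out)
dense_dropped = dense_out.count(0.0) / len(dense_out)
print(f"conv: {conv_dropped:.2%} dropped, dense: {dense_dropped:.2%} dropped")
```

With a rate of 0.2 roughly one unit in five is zeroed and the rest are scaled by 1.25. With a rate of 0.5 roughly half are zeroed and the rest are doubled. At inference time dropout is disabled and activations pass through unchanged.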
In this Take3 iteration, we will construct and tune machine learning models for this dataset using a simple VGG-5 network with image augmentation. We will observe the best result that we can obtain using the validation and test datasets.
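The summary does not say which augmentations were applied, so the following is only an illustrative sketch of the general idea: each training image is randomly transformed while its class label stays the same. The flip operations, the 50% probabilities, and the tiny 2x3 "image" are assumptions, not the project's actual pipeline.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right (reverse each row)."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom (reverse the row order)."""
    return [row[:] for row in image[::-1]]


def augment(image, rng):
    """Return a randomly flipped copy of `image`; the input is not modified."""
    out = [row[:] for row in image]
    if rng.random() < 0.5:  # assumed probability
        out = horizontal_flip(out)
    if rng.random() < 0.5:  # assumed probability
        out = vertical_flip(out)
    return out


# Hypothetical 2x3 grayscale image standing in for a leaf photo.
image = [[1, 2, 3],
         [4, 5, 6]]

rng = random.Random(42)  # fixed seed, chosen only for repeatability
augmented = [augment(image, rng) for _ in range(4)]
for variant in augmented:
    print(variant)
```

Because the transformed copies contain the same pixels in a different arrangement, the network sees more varied inputs per class without any new labeling effort.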
ANALYSIS: In iteration Take1, the baseline model achieved an accuracy of 75.94% on the validation dataset after 50 epochs. The same model scored 82.03% on the test dataset.
In iteration Take2, the model with dropout achieved an accuracy of 80.45% on the validation dataset after 100 epochs. The same model scored 85.16% on the test dataset.
In this Take3 iteration, the model with image augmentation achieved an accuracy of 90.98% on the validation dataset after 100 epochs. The same model scored 86.72% on the test dataset.
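The three sets of results are easier to compare side by side. The short sketch below uses only the accuracy figures reported above and computes, for each iteration, the gain over the Take1 baseline and the gap between validation and test accuracy, all in percentage points. The dictionary layout and field names are illustrative choices.

```python
# Accuracy figures (percent) as reported for each iteration.
results = {
    "Take1": {"validation": 75.94, "test": 82.03},
    "Take2": {"validation": 80.45, "test": 85.16},
    "Take3": {"validation": 90.98, "test": 86.72},
}

baseline = results["Take1"]
summary = {}
for name, acc in results.items():
    summary[name] = {
        # Improvement over the Take1 baseline, in percentage points.
        "val_gain": round(acc["validation"] - baseline["validation"], 2),
        "test_gain": round(acc["test"] - baseline["test"], 2),
        # Positive when validation accuracy exceeds test accuracy.
        "val_minus_test": round(acc["validation"] - acc["test"], 2),
    }

for name, row in summary.items():
    print(
        f"{name}: val gain {row['val_gain']:+.2f}, "
        f"test gain {row['test_gain']:+.2f}, "
        f"val - test {row['val_minus_test']:+.2f}"
    )
```

Seen this way, Take3 improves validation accuracy far more than test accuracy, and it is the first iteration in which validation accuracy exceeds test accuracy.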
CONCLUSION: For this dataset, the model built using VGG-5 blocks and image augmentation performed adequately, reaching the highest validation and test accuracy of the three iterations. However, we should consider tuning the model further by using other available regularization techniques.
Dataset Used: AIR Lab iBeans Dataset
Dataset ML Model: Multi-class classification with numerical attributes
Dataset Reference: https://github.com/AI-Lab-Makerere/ibean/
The HTML formatted report can be found here on GitHub.