Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The AIR Lab iBeans dataset presents a multi-class classification problem in which we try to predict one of three possible classes.
INTRODUCTION: The dataset consists of leaf images taken in the field across different districts in Uganda by the Makerere AI lab in collaboration with the National Crops Resources Research Institute (NaCRRI), Uganda’s national body in charge of agricultural research.
The goal is to build a robust machine learning model that can distinguish between diseases in bean plants. The leaf images fall into three classes: healthy leaves and two disease classes, Angular Leaf Spot and Bean Rust. The model should distinguish between these three classes with high accuracy. The end goal is a model that can be deployed on a mobile device and used in the field by a farmer.
In iteration Take1, we constructed and tuned machine learning models for this dataset using TensorFlow with a simple VGG-5 network. We also recorded the best result we could obtain on the validation and test datasets. The final output from that iteration became our baseline performance level for future iterations.
In iteration Take2, we constructed and tuned machine learning models for this dataset using a simple VGG-5 network with dropout regularization, using a rate of 0.2 for the convolutional layers and 0.5 for the fully connected layers. The dropout layers and rates were inspired by the original VGG research paper. We also recorded the best result we could obtain on the validation and test datasets.
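As a rough illustration of the Take2 setup, the sketch below builds a small VGG-style network in Keras with a dropout rate of 0.2 after each convolutional block and 0.5 after the fully connected layer. The exact VGG-5 layout is not spelled out in this summary, so the number of blocks, the filter counts, the input size, the dense layer width, the optimizer, and the function name `build_vgg_style_model` are all assumptions made for illustration only.

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 3              # healthy, Angular Leaf Spot, Bean Rust
INPUT_SHAPE = (128, 128, 3)  # hypothetical input size, not taken from the project


def build_vgg_style_model(conv_dropout=0.2, dense_dropout=0.5):
    """Build a small VGG-style CNN with dropout (illustrative layout only)."""
    model = models.Sequential(name="vgg_style_with_dropout")
    model.add(layers.Input(shape=INPUT_SHAPE))

    # Assumed layout: three Conv -> MaxPool blocks, each followed by dropout.
    for filters in (32, 64, 128):
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(2))
        model.add(layers.Dropout(conv_dropout))

    # Fully connected head with the heavier dropout rate.
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dropout(dense_dropout))
    model.add(layers.Dense(NUM_CLASSES, activation="softmax"))

    # Assumes integer class labels; the optimizer choice is also an assumption.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_vgg_style_model()
```

Calling `model.summary()` lists the layers, which makes it easy to confirm where each dropout layer sits relative to the convolutional and dense layers.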
In iteration Take3, we constructed and tuned machine learning models for this dataset using a simple VGG-5 network with image augmentation. We also recorded the best result we could obtain on the validation and test datasets.
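Image augmentation can be set up in several ways in TensorFlow. The sketch below uses Keras preprocessing layers as one possible approach. The specific transforms, their magnitudes, and the 128x128 input size are assumptions, and the project itself may have used a different mechanism.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical augmentation pipeline; transforms and magnitudes are illustrative.
augment = tf.keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),  # rotate by up to +/- 10% of a full turn
        layers.RandomZoom(0.1),      # zoom in or out by up to 10%
    ],
    name="augmentation",
)

# A dummy batch of four RGB images stands in for real leaf images.
images = tf.random.uniform((4, 128, 128, 3))

# Random transforms are applied only when training=True.
augmented = augment(images, training=True)

# With training=False the layers act as a pass-through.
passthrough = augment(images, training=False)
```

Placing these layers at the front of the model means the augmentation runs only during training and is skipped automatically at evaluation and prediction time.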
In this Take4 iteration, we will construct and tune machine learning models for this dataset using a simple VGG-5 network with both dropout layers and image augmentation. We will record the best result we can obtain on the validation and test datasets.
ANALYSIS: In iteration Take1, the baseline model achieved an accuracy of 75.94% on the validation dataset after 50 epochs. The same model achieved an accuracy of 82.03% on the test dataset.
In iteration Take2, the model with dropout achieved an accuracy of 80.45% on the validation dataset after 100 epochs. The same model achieved an accuracy of 85.16% on the test dataset.
In iteration Take3, the model with image augmentation achieved an accuracy of 90.98% on the validation dataset after 100 epochs. The same model achieved an accuracy of 86.72% on the test dataset.
In this Take4 iteration, the model with both dropout and image augmentation achieved an accuracy of 93.98% on the validation dataset after 200 epochs. The same model achieved an accuracy of 89.06% on the test dataset.
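To make the comparison across iterations easier to read, the short sketch below tabulates the reported accuracies and computes each iteration's gain over the Take1 baseline. The numbers are the ones reported above. The dictionary layout and variable names are only illustrative.

```python
# Reported results per iteration: epochs trained, validation and test accuracy (%).
results = {
    "Take1": {"epochs": 50, "val": 75.94, "test": 82.03},
    "Take2": {"epochs": 100, "val": 80.45, "test": 85.16},
    "Take3": {"epochs": 100, "val": 90.98, "test": 86.72},
    "Take4": {"epochs": 200, "val": 93.98, "test": 89.06},
}

baseline = results["Take1"]

# Gain in percentage points over the Take1 baseline for each iteration.
gains = {
    name: {
        "val": round(r["val"] - baseline["val"], 2),
        "test": round(r["test"] - baseline["test"], 2),
    }
    for name, r in results.items()
}

# Iteration with the highest test accuracy.
best = max(results, key=lambda name: results[name]["test"])

print(f"{'Iteration':<10}{'Epochs':>8}{'Val %':>8}{'Test %':>8}{'Val gain':>10}{'Test gain':>11}")
for name, r in results.items():
    g = gains[name]
    print(
        f"{name:<10}{r['epochs']:>8}{r['val']:>8.2f}{r['test']:>8.2f}"
        f"{g['val']:>+10.2f}{g['test']:>+11.2f}"
    )
print(f"Best test accuracy: {best}")
```

Note that the iterations were trained for different numbers of epochs, so the gains reflect the combined effect of the added techniques and the longer training.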
CONCLUSION: For this dataset, the model built using VGG-5 blocks with dropout layers and image augmentation gave the best results of the four iterations, reaching 93.98% accuracy on the validation dataset and 89.06% on the test dataset. We should consider using TensorFlow for further modeling and testing.
Dataset Used: AIR Lab iBeans Dataset
Dataset ML Model: Multi-class classification with numerical attributes
Dataset Reference: https://github.com/AI-Lab-Makerere/ibean/
The HTML-formatted report can be found here on GitHub.