Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: This project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The MNIST Handwritten Digits dataset is a multi-class image classification problem where we attempt to predict one of several (more than two) possible outcomes.
INTRODUCTION: MNIST is a dataset developed by Yann LeCun, Corinna Cortes, and Christopher Burges for evaluating machine learning models on the handwritten digit classification problem. The dataset was constructed from many scanned document datasets available from the National Institute of Standards and Technology (NIST). Each image is a 28 by 28-pixel square (784 pixels total). A standard split of the dataset is used to evaluate and compare models, where 60,000 images are used to train a model, and a separate set of 10,000 images is used to test it. It is a digit recognition task, so there are ten classes (0 to 9) to predict.
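To make the dimensions above concrete, here is a minimal sketch of how a single 28 by 28 image maps to 784 input values and how a digit label maps to one of ten classes. The helper names (flatten_and_scale, one_hot) and the synthetic image are illustrative assumptions, and scaling pixels to the range 0 to 1 is a common convention rather than a step confirmed for this project.

```python
import random

IMAGE_SIDE = 28                        # each MNIST image is 28 x 28 pixels
NUM_PIXELS = IMAGE_SIDE * IMAGE_SIDE   # 784 pixels in total
NUM_CLASSES = 10                       # digits 0 through 9


def flatten_and_scale(image):
    """Flatten a 28x28 grid of 0-255 integers into 784 floats in [0, 1]."""
    if len(image) != IMAGE_SIDE or any(len(row) != IMAGE_SIDE for row in image):
        raise ValueError("expected a 28x28 image")
    return [pixel / 255.0 for row in image for pixel in row]


def one_hot(label):
    """Encode a digit label (0-9) as a ten-element indicator vector."""
    if not 0 <= label < NUM_CLASSES:
        raise ValueError("label must be between 0 and 9")
    return [1.0 if i == label else 0.0 for i in range(NUM_CLASSES)]


# Synthetic placeholder image (random noise, NOT real MNIST data).
rng = random.Random(0)
fake_image = [[rng.randint(0, 255) for _ in range(IMAGE_SIDE)]
              for _ in range(IMAGE_SIDE)]

features = flatten_and_scale(fake_image)   # 784 values
target = one_hot(7)                        # class "7" of ten classes
```

With the real dataset, the same shapes would apply to each of the 60,000 training images and 10,000 test images.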
ANALYSIS: After a series of modeling trials, the best AutoKeras model achieved an accuracy score of 94.84% on the validation dataset. When we applied that model to the previously unseen test dataset, we obtained an accuracy score of 98.4%.
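For reference, the accuracy score quoted above is the fraction of images whose predicted digit matches the true digit. The sketch below is a plain-Python illustration of that metric, not the evaluation code used by AutoKeras, and the label lists are made-up examples. If the full standard test split of 10,000 images was used, an accuracy of 98.4% would correspond to roughly 9,840 correctly classified images.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels for ten images; one prediction (5 vs. 6) is wrong.
y_true = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]
y_pred = [7, 2, 1, 0, 4, 1, 4, 9, 6, 9]

score = accuracy(y_true, y_pred)
print(f"accuracy: {score:.2%}")
```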
CONCLUSION: In this iteration, the best TensorFlow model generated by AutoKeras appeared to be suitable for modeling this dataset. We should consider experimenting with AutoKeras for further modeling.
Dataset Used: MNIST Handwritten Digits Dataset
Dataset ML Model: Multi-class image classification with numerical attributes
Dataset Reference: https://www.tensorflow.org/datasets/catalog/mnist
One potential source of performance benchmark: https://machinelearningmastery.com/how-to-develop-a-convolutional-neural-network-from-scratch-for-mnist-handwritten-digit-classification/
The HTML formatted report can be found here on GitHub.