Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: The project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Sign Language MNIST dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.
INTRODUCTION: The original MNIST image dataset of handwritten digits is a popular benchmark for image-based machine learning methods. The Sign Language MNIST is presented here and follows the same CSV format with labels and pixel values in single rows to stimulate the community to develop more drop-in replacements. The American Sign Language letter database of hand gestures represent a multi-class problem with 24 classes of letters (excluding J and Z, which require motion).
The dataset format is patterned to match closely with the classic MNIST. Each training and test case represents a label (0-25) as a one-to-one map for each alphabetic letter A-Z (and no cases for 9=J or 25=Z because of gesture motions). The training data (27,455 cases) and test data (7172 instances) are approximately half the size of the standard MNIST but otherwise similar with a header row of the labels, pixel1,pixel2….pixel784 which represent a single 28×28 pixel image with grayscale values between 0-255. The original hand gesture image data represented multiple users repeating the gesture against different backgrounds.
In iteration Take1, we constructed a Multilayer Perceptron (MLP) model with five hidden layers to model this dataset.
In this Take2 iteration, we will construct a Convolutional Neural Network (CNN) model with five hidden layers to model this dataset.
ANALYSIS: From iteration Take1, the MLP model’s performance achieved an accuracy score of 94.93% after 50 epochs using the training dataset. The same model processed the test dataset with an accuracy score of 77.93%, which pointed to a high variance error.
In this Take2 iteration, the CNN model’s performance achieved an accuracy score of 99.94% after 25 epochs using the training dataset. The same model processed the test dataset with an accuracy score of 99.55%, which pointed to a high variance error.
CONCLUSION: In this iteration, the TensorFlow CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.
Dataset Used: Sign Language MNIST Dataset
Dataset ML Model: Multi-class classification with numerical attributes
Dataset Reference: https://www.kaggle.com/datamunge/sign-language-mnist
One source of potential performance benchmarks: https://www.kaggle.com/datamunge/sign-language-mnist
One potential source of performance benchmarks: https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/leaderboard
The HTML formatted report can be found here on GitHub.