Binary Classification Model for MiniBooNE Particle Identification Using AutoKeras

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The MiniBooNE Particle Identification dataset presents a binary classification problem: each event is predicted to be one of two possible outcomes.

INTRODUCTION: This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos (signal) from muon neutrinos (background). The researchers set up the data file as follows. The first line is the number of signal events followed by the number of background events. The records with the signal events come first, followed by the background events. Each line, after the first line, has the 50 particle ID variables for one event.
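The file layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual loading code: the function name `load_miniboone`, the whitespace-separated format, and the 1/0 label encoding are assumptions made for the example, and the tiny in-memory sample stands in for the real file.

```python
import io
from typing import List, TextIO, Tuple


def load_miniboone(handle: TextIO) -> Tuple[List[List[float]], List[int]]:
    """Parse the layout described above into feature rows and labels.

    Assumes values are separated by whitespace (hypothetical; adjust the
    split if the real file uses another delimiter).
    """
    # First line: number of signal events, then number of background events.
    header = handle.readline().split()
    n_signal, n_background = int(header[0]), int(header[1])

    # Every remaining non-blank line holds the variables for one event.
    features = [[float(value) for value in line.split()]
                for line in handle if line.strip()]

    if len(features) != n_signal + n_background:
        raise ValueError("row count does not match the counts in the header")

    # Signal rows come first, so labels follow from row position alone.
    # Assumed encoding: 1 = signal (electron neutrino), 0 = background.
    labels = [1] * n_signal + [0] * n_background
    return features, labels


# Synthetic stand-in: 2 signal events, 1 background event, 3 variables each
# (the real file has 50 variables per event).
sample = io.StringIO("2 1\n0.1 0.2 0.3\n0.4 0.5 0.6\n0.7 0.8 0.9\n")
features, labels = load_miniboone(sample)
print(labels)  # [1, 1, 0]
```

Because the rows are ordered by class, the data should be shuffled before it is split into training, validation, and test sets; otherwise a split could end up containing events of only one class.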

ANALYSIS: In another TensorFlow modeling exercise, the baseline model (2 layers with 32 nodes each) achieved an accuracy of 95.17% on the training dataset after 20 epochs. After hyperparameter tuning, the best model (2 layers with 512 nodes each) reached an accuracy of 97.88% on the validation dataset. On the previously unseen test dataset, the final model achieved an accuracy of 94.40%.

In this iteration, after a series of modeling trials, the best AutoKeras model (2 layers with 256 and 32 nodes) reached a maximum accuracy of 94.64% on the validation dataset. When we applied the AutoKeras model to the previously unseen test dataset, it achieved an accuracy of 94.54%.

CONCLUSION: In this iteration, the best TensorFlow model generated by AutoKeras appeared to be suitable for modeling this dataset: its test accuracy of 94.54% is comparable to the 94.40% of the manually tuned TensorFlow model. We should consider experimenting with AutoKeras for further modeling.

Dataset Used: MiniBooNE Particle Identification Dataset

Dataset ML Model: Binary classification with numerical attributes

Dataset Reference:

The HTML-formatted report can be found here on GitHub.