Regression Deep Learning Model for Song Year Prediction Using TensorFlow Take 3

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a prediction model using various machine learning algorithms and to document the end-to-end steps using a template. The Song Year Prediction dataset presents a classic regression problem, in which we try to predict the value of a continuous variable.

INTRODUCTION: This data is a subset of the Million Song Dataset, http://labrosa.ee.columbia.edu/millionsong/, a collaboration between LabROSA (Columbia University) and The Echo Nest. The purpose of this exercise is to predict the release year of a song from audio features. Songs are mostly Western, commercial tracks ranging from 1922 to 2011, with a peak in the 2000s. The data preparer recommended a train/test split that uses the first 463,715 examples for training and the last 51,630 examples for testing. This approach avoids the ‘producer effect’ by making sure no song from a given artist ends up in both the train and test sets.
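The recommended split is purely positional, so it can be sketched in a few lines. The following is a minimal illustration, not the project's actual loading code: the function name `split_by_position` and the argument `rows` are hypothetical, and the sketch assumes the examples are already loaded in their original file order.

```python
def split_by_position(rows, n_train=463_715, n_test=51_630):
    """Return (train, test) using the first n_train and last n_test rows.

    `rows` is any sliceable sequence (a list, a NumPy array, ...) that
    preserves the original file order. No shuffling is done, because
    shuffling could place songs by the same artist on both sides of the
    split and reintroduce the 'producer effect'.
    """
    if n_train <= 0 or n_test <= 0:
        raise ValueError("n_train and n_test must be positive")
    if len(rows) < n_train + n_test:
        raise ValueError("not enough rows for a non-overlapping split")
    train = rows[:n_train]   # first n_train examples
    test = rows[-n_test:]    # last n_test examples
    return train, test


# Small-scale usage example with stand-in data.
toy_rows = list(range(100))
toy_train, toy_test = split_by_position(toy_rows, n_train=90, n_test=10)
```

With the default arguments the two parts add up to 515,345 examples, which is the sum of the two counts quoted above.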

In iteration Take1, we constructed several Multilayer Perceptron (MLP) models with one hidden layer of 16, 32, 64, and 128 nodes. The single-layer MLP model serves as the baseline model as we build more complex MLP models in future iterations.

In iteration Take2, we constructed several Multilayer Perceptron (MLP) models with two hidden layers. We observed the effects of having an additional layer in our MLP models.

In this Take3 iteration, we will construct several Multilayer Perceptron (MLP) models with three hidden layers. We will observe whether the additional layers improve the RMSE over the one- and two-layer models from the earlier iterations.
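As a rough sketch of what a three-hidden-layer MLP for this task might look like in TensorFlow, consider the builder below. Everything specific in it is an assumption rather than the configuration used in this project: the function name `build_three_layer_mlp`, the layer widths, the ReLU activation, and the Adam optimizer are illustrative choices, and the input width of 90 comes from the UCI description of the dataset (90 audio attributes), not from this report.

```python
import tensorflow as tf


def build_three_layer_mlp(n_features=90, units=(64, 32, 16)):
    """Build a regression MLP with three hidden layers (illustrative sizes)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(units[0], activation="relu"),  # hidden layer 1
        tf.keras.layers.Dense(units[1], activation="relu"),  # hidden layer 2
        tf.keras.layers.Dense(units[2], activation="relu"),  # hidden layer 3
        tf.keras.layers.Dense(1),  # linear output: the predicted release year
    ])
    model.compile(
        optimizer="adam",
        loss="mse",
        metrics=[tf.keras.metrics.RootMeanSquaredError()],
    )
    return model


model = build_three_layer_mlp()
```

Passing a different `units` tuple gives a simple way to compare several three-layer architectures under otherwise identical settings.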

ANALYSIS: In iteration Take1, all models processed the test dataset and produced RMSEs of around 9.50. However, the single-layer models did not exhibit a stable curve when making predictions with the test dataset.

In iteration Take2, all models processed the test dataset and again produced RMSEs of around 9.50. Moreover, the dual-layer models also did not exhibit a stable curve when making predictions with the test dataset.

In this Take3 iteration, all models processed the test dataset and again produced RMSEs of around 9.50. Moreover, the tri-layer models also did not exhibit a stable curve when making predictions with the test dataset.
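For reference, RMSE is the square root of the mean squared difference between predicted and actual values. Because the target here is a release year, an RMSE of around 9.50 is expressed in years. The helper below is a minimal sketch with a hypothetical name (`rmse`) and made-up sample values; it is not the evaluation code used in these iterations.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up release years, for illustration only.
actual_years = [2001, 1995, 2008, 1987]
predicted_years = [1998, 2003, 2005, 1999]
error_in_years = rmse(actual_years, predicted_years)
```

Since the errors are squared before averaging, a few predictions that miss by decades raise the RMSE far more than many small misses.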

CONCLUSION: For this iteration, the different model architectures produced similar RMSEs. For this dataset, we should consider experimenting with more MLP models that apply some regularization techniques.
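One possible direction for such experiments is sketched below. It combines three common regularization techniques: an L2 weight penalty, dropout, and early stopping. None of these settings come from this project. The name `build_regularized_mlp`, the layer widths, the dropout rate, the L2 weight, the patience value, and the 90-feature input width (taken from the UCI dataset description) are all assumptions chosen for illustration.

```python
import tensorflow as tf


def build_regularized_mlp(n_features=90, units=(64, 32, 16),
                          dropout_rate=0.2, l2_weight=1e-4):
    """Three-hidden-layer MLP with an L2 weight penalty and dropout."""
    layers = [tf.keras.Input(shape=(n_features,))]
    for width in units:
        layers.append(tf.keras.layers.Dense(
            width,
            activation="relu",
            # Penalize large weights in this layer.
            kernel_regularizer=tf.keras.regularizers.L2(l2_weight),
        ))
        # Randomly zero a fraction of activations during training only.
        layers.append(tf.keras.layers.Dropout(dropout_rate))
    layers.append(tf.keras.layers.Dense(1))  # linear output for regression
    model = tf.keras.Sequential(layers)
    model.compile(
        optimizer="adam",
        loss="mse",
        metrics=[tf.keras.metrics.RootMeanSquaredError()],
    )
    return model


# Stop training when validation loss stops improving (hypothetical settings).
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)

regularized_model = build_regularized_mlp()
```

The `early_stop` callback would be passed through the `callbacks` argument of `model.fit` together with validation data, so that training halts once the validation loss stops improving.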

Dataset Used: YearPredictionMSD Dataset

Dataset ML Model: Regression with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/YearPredictionMSD

Thierry Bertin-Mahieux, Daniel P.W. Ellis, Brian Whitman, and Paul Lamere. The Million Song Dataset. In Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011), 2011.

One potential source of performance benchmarks: https://www.kaggle.com/uciml/msd-audio-features/home

The HTML-formatted report can be found on GitHub.