Multi-Label Deep Learning Model for Planet: Understanding the Amazon from Space Using TensorFlow Take 2

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Planet: Understanding the Amazon from Space dataset is a multi-label classification problem: each image can carry several labels at once, and we attempt to predict the full set of labels that applies to each image rather than a single outcome.
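To make the multi-label setup concrete, here is a minimal sketch of how space-separated tag strings can be turned into multi-hot target vectors. The sample tag strings and the helper names `build_tag_index` and `multi_hot` are illustrative assumptions and are not taken from the project scripts.

```python
# Minimal sketch: multi-hot encoding of space-separated tags.
# The tag strings below are illustrative samples, not project data.

def build_tag_index(tag_strings):
    """Map every distinct tag to a column index (sorted for stability)."""
    labels = sorted({tag for s in tag_strings for tag in s.split()})
    return {tag: i for i, tag in enumerate(labels)}

def multi_hot(tag_string, tag_index):
    """Return a 0/1 vector with a 1 for each tag present in tag_string."""
    vec = [0] * len(tag_index)
    for tag in tag_string.split():
        vec[tag_index[tag]] = 1
    return vec

rows = ["haze primary", "clear agriculture primary water", "clear primary"]
tag_index = build_tag_index(rows)
encoded = [multi_hot(r, tag_index) for r in rows]
```

Each image ends up with one vector whose length equals the number of distinct tags, and more than one position can be set to 1. That is the difference from multi-class classification, where exactly one position would be set.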

INTRODUCTION: Planet, designer and builder of the world’s largest constellation of Earth-imaging satellites, collaborated with its Brazilian partner SCCON in challenging Kaggle participants to label satellite image chips with atmospheric conditions and various classes of land cover/land use. The resulting models will help the global community better understand deforestation conditions and how to respond to them.

The purpose of this modeling exercise is to construct an end-to-end template for solving multi-label machine learning problems. The series of scripting exercises will replicate Dr. Jason Brownlee’s blog post on this topic to build a robust template for future similar problems.

In iteration Take1, we constructed the necessary script segments to download and pre-process the image files available on Kaggle’s website.

In this Take2 iteration, we will construct the necessary script segments to train the TensorFlow model and evaluate its effectiveness.

ANALYSIS: In iteration Take1, we successfully downloaded and pre-processed the image files from Kaggle.

In this Take2 iteration, the baseline model achieved an F-beta score of 0.8297 on the validation dataset after 20 epochs.
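For reference, the F-beta score for multi-label predictions can be sketched in plain Python as shown below. This sketch assumes beta = 2 (the F2 variant used as the Kaggle competition metric) and per-image averaging. The function names are hypothetical, and the project's own scoring code may differ.

```python
# Minimal sketch of a per-sample-averaged F-beta score for multi-hot labels.
# Assumption: beta=2 and averaging over images; not the project's own code.

def fbeta_per_sample(y_true, y_pred, beta=2.0):
    """F-beta for one image, given 0/1 label vectors of equal length."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # Define the score as 0.0 when there are no positives at all.
    return (1 + b2) * tp / denom if denom else 0.0

def mean_fbeta(Y_true, Y_pred, beta=2.0):
    """Average the per-image F-beta over all images."""
    scores = [fbeta_per_sample(t, p, beta) for t, p in zip(Y_true, Y_pred)]
    return sum(scores) / len(scores)

Y_true = [[1, 1, 0, 0], [1, 0, 1, 0]]
Y_pred = [[1, 0, 1, 0], [1, 0, 1, 0]]
score = mean_fbeta(Y_true, Y_pred)
```

With beta greater than 1, false negatives are penalized more heavily than false positives, so missing a label that applies costs more than predicting one that does not.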

CONCLUSION: In this iteration, the TensorFlow model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Planet: Understanding the Amazon from Space

Dataset ML Model: Multi-label classification with numerical attributes

Dataset Reference:

One potential source of performance benchmarks:

The HTML-formatted report can be found here on GitHub.