SUMMARY: This project aims to construct a text classification model using a neural network and document the end-to-end steps using a template. The Sentiment Labelled Sentences dataset is a binary classification situation where we attempt to predict one of two possible outcomes.

INTRODUCTION: This dataset was created for the research paper ‘From Group to Individual Labels using Deep Features,’ Kotzias et al., KDD 2015. For each website, the researchers randomly selected 500 positive and 500 negative sentences from a larger set of reviews. They also deliberately chose sentences with a clearly positive or negative connotation, since the goal was to avoid selecting neutral sentences.

In this Take1 iteration, we will build a bag-of-words model to classify the review comments in the Amazon portion of the dataset. We will also apply various sequence-to-matrix encoding modes and compare the model’s performance under each.
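The sequence-to-matrix idea can be sketched in plain Python, mirroring the encoding modes offered by the Keras Tokenizer’s texts_to_matrix method. The sample texts and vocabulary below are illustrative placeholders, and the TF-IDF weighting is a simplified stand-in for the library’s exact formula.

```python
# Plain-Python sketch of bag-of-words sequence-to-matrix encoding modes
# (illustrative; mirrors Keras Tokenizer.texts_to_matrix behavior).
import math
from collections import Counter

def texts_to_matrix(texts, vocab, mode="binary"):
    """Encode each text as a fixed-length vector over the vocabulary."""
    rows = []
    for text in texts:
        words = text.split()
        counts = Counter(words)
        row = []
        for word in vocab:
            c = counts.get(word, 0)
            if mode == "binary":
                row.append(1.0 if c else 0.0)       # word present or not
            elif mode == "count":
                row.append(float(c))                # raw occurrence count
            elif mode == "freq":
                row.append(c / max(len(words), 1))  # count relative to text length
            elif mode == "tfidf":
                # Simplified TF-IDF: count damped by document frequency.
                df = sum(1 for t in texts if word in t.split())
                row.append(c * math.log(1 + len(texts) / (1 + df)))
        rows.append(row)
    return rows

texts = ["good great good", "bad awful"]
vocab = ["good", "great", "bad", "awful"]
print(texts_to_matrix(texts, vocab, mode="count")[0])  # [2.0, 1.0, 0.0, 0.0]
```

Each mode feeds the same downstream dense network a different view of the text, which is what this iteration compares.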

ANALYSIS: In this Take1 iteration, the bag-of-words model’s performance achieved an average accuracy score of 77.31% after 25 epochs with ten iterations of cross-validation. Furthermore, the final model processed the test dataset with an accuracy measurement of 71.00%.

CONCLUSION: In this modeling iteration, the bag-of-words TensorFlow model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Sentiment Labelled Sentences

Dataset ML Model: Binary-class text classification with text-oriented features

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences

The HTML formatted report can be found here on GitHub.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Intel Image Classification dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: This dataset contains over 17,000 images of size 150×150 distributed under six categories: buildings, forest, glacier, mountain, sea, and street. There are approximately 14,000 images in the training set and 3,000 in the test/validation set. This dataset was initially published on https://datahack.analyticsvidhya.com by Intel as part of a data science competition.

From iteration Take1, we constructed a simple three-layer CNN neural network as the baseline model. We plan to use this model’s performance as the baseline measurement for future iterations of modeling.
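As a rough illustration of what such a baseline involves, the arithmetic below traces how three convolution-plus-pooling blocks would shrink a 150×150 input before it reaches the dense classifier head. The unpadded 3×3 kernels and 2×2 max-pools are assumptions for the sketch, not the author’s exact layer settings.

```python
# Spatial-size bookkeeping for a hypothetical three-block CNN on 150x150
# inputs (unpadded 3x3 convs and 2x2 max-pools are illustrative choices).
def conv2d_out(size, kernel=3, stride=1, padding=0):
    """Output size of a square convolution layer."""
    return (size - kernel + 2 * padding) // stride + 1

def pool_out(size, pool=2):
    """Output size of a non-overlapping max-pooling layer."""
    return size // pool

size = 150
for _ in range(3):                       # three conv/pool blocks
    size = pool_out(conv2d_out(size))    # 150 -> 74 -> 36 -> 17
print(size)  # 17: a 17x17 feature map feeds the dense classifier head
```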

From iteration Take2, we constructed a VGG16 neural network as an alternate model. We also compared this model’s performance with the baseline model from iteration Take1.

From iteration Take3, we constructed an InceptionV3 neural network as an alternate model. We also compared this model’s performance with the baseline model from iteration Take1.

From iteration Take4, we constructed a ResNet50V2 neural network as an alternate model. We also compared this model’s performance with the baseline model from iteration Take1.

In this Take5 iteration, we will construct a DenseNet201 neural network as an alternate model. We will compare this model’s performance with the baseline model from iteration Take1.
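A transfer-learning setup of this kind might look like the following Keras sketch. The frozen base, pooling head, and compile settings are assumptions for illustration rather than the configuration used in this iteration; `weights=None` keeps the sketch lightweight, whereas in practice `weights="imagenet"` would supply the pre-trained features being transferred.

```python
# Hedged sketch of a DenseNet201-based classifier for the six scene
# categories; layer choices and compile settings are illustrative.
import tensorflow as tf

base = tf.keras.applications.DenseNet201(
    include_top=False,
    weights=None,           # use weights="imagenet" for real transfer learning
    input_shape=(150, 150, 3),
)
base.trainable = False      # freeze the convolutional base

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(6, activation="softmax"),  # six categories
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```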

ANALYSIS: From iteration Take1, the baseline model’s performance achieved an accuracy score of 88.62% after 30 epochs using the training images. The baseline model also processed the validation images with an accuracy score of 85.37%.

From iteration Take2, the VGG16 model’s performance achieved an accuracy score of 83.57% after 30 epochs using the training images. The VGG16 model also processed the validation images with an accuracy score of 79.53%.

From iteration Take3, the InceptionV3 model’s performance achieved an accuracy score of 91.24% after 30 epochs using the training images. The InceptionV3 model also processed the validation images with an accuracy score of 87.10%.

From iteration Take4, the ResNet50V2 model’s performance achieved an accuracy score of 88.93% after 30 epochs using the training images. The ResNet50V2 model also processed the validation images with an accuracy score of 87.17%.

In this Take5 iteration, the DenseNet201 model’s performance achieved an accuracy score of 91.44% after 30 epochs using the training images. The DenseNet201 model also processed the validation images with an accuracy score of 87.27%.

CONCLUSION: In this iteration, the TensorFlow CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Intel Image Classification Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/puneet6060/intel-image-classification

One potential source of performance benchmarks: https://www.kaggle.com/puneet6060/intel-image-classification

The HTML formatted report can be found here on GitHub.

SUMMARY: This project aims to construct and test an algorithmic trading model and document the end-to-end steps using a template.

INTRODUCTION: This algorithmic trading model examines a simple trend-following strategy for a stock. The model buys a stock when the price reaches the highest price for the last X number of days. The model will exit the position when the stock price crosses below the mean of the same window size.
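The rule can be sketched as a short long-only backtest loop. The synthetic price list, window size, and per-share profit accounting below are illustrative assumptions, not the parameters tested in the iterations that follow.

```python
# Long-only sketch of the trend-following rule described above: enter on
# a new X-day high, exit when the price falls below the X-day mean.
def run_strategy(prices, window=5):
    position_price = None   # entry price while a position is open
    profit = 0.0            # cumulative profit per share
    for i in range(window, len(prices)):
        recent = prices[i - window:i + 1]   # last X days plus today
        if position_price is None and prices[i] >= max(recent):
            position_price = prices[i]      # buy at a new X-day high
        elif position_price is not None and prices[i] < sum(recent) / len(recent):
            profit += prices[i] - position_price  # exit below the window mean
            position_price = None
    return profit

print(run_strategy([1, 2, 3, 4, 5, 6, 7, 8, 9, 5], window=3))  # 1.0
```

In this toy series, the model buys the first new 3-day high at 4, rides the trend, and exits at 5 once the price crosses below the rolling mean.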

From iteration Take1, we set up the models using one fixed window size for long trades only. The window size varied from 10 to 50 trading days at a 5-day increment.

From iteration Take2, we set up the models using one fixed window size for long trades only. The window size varied from 10 to 50 trading days at a 5-day increment. The models also considered a volume indicator with its own window size to confirm the buy/sell signal.

From iteration Take3, we set up the models using one fixed window size for long and short trades. The window size varied from 10 to 50 trading days at a 5-day increment.

From iteration Take4, we set up the models using one fixed window size for long and short trades. The window size varied from 10 to 50 trading days at a 5-day increment. The models also considered a volume indicator with its own window size to confirm the buy/sell signal.

In this Take5 iteration, we will set up the models using one fixed window size for long trades only. The window size will vary from 10 to 50 trading days at a 5-day increment. The models will also consider a volume indicator with a varying window size between 10 and 25 days to further confirm the buy/sell signal.
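The volume confirmation added in this iteration could be sketched as an extra entry filter. The window lengths and the above-rolling-average test are illustrative assumptions about how such a confirmation might work, not the exact indicator used.

```python
# Sketch: confirm a new-high entry signal with above-average volume.
def confirmed_entry(prices, volumes, i, price_window=10, vol_window=3):
    """True when day i sets a new price high AND trades on above-average volume."""
    recent_prices = prices[i - price_window:i + 1]
    prior_volumes = volumes[i - vol_window:i]        # the vol_window days before day i
    new_high = prices[i] >= max(recent_prices)
    heavy_volume = volumes[i] > sum(prior_volumes) / vol_window
    return new_high and heavy_volume

prices = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
volumes = [10] * 10 + [40]                           # volume spike on the last day
print(confirmed_entry(prices, volumes, 10))          # True
```

A new high on thin volume would be rejected by this filter, which is the intuition behind requiring the confirmation before entering a trade.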

ANALYSIS: From iteration Take1, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 81.49 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

From iteration Take2, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 82.47 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

From iteration Take3, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 79.95 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

From iteration Take4, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 74.70 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

In this Take5 iteration, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 83.39 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

CONCLUSION: For the stock of AAPL during the modeling time frame, the trading strategy did not produce a better return than the buy-and-hold approach. We should consider modeling this stock further by experimenting with more variations of the strategy.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Quandl

The HTML formatted report can be found here on GitHub.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Intel Image Classification dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: This dataset contains over 17,000 images of size 150×150 distributed under six categories: buildings, forest, glacier, mountain, sea, and street. There are approximately 14,000 images in the training set and 3,000 in the test/validation set. This dataset was initially published on https://datahack.analyticsvidhya.com by Intel as part of a data science competition.

From iteration Take1, we constructed a simple three-layer CNN neural network as the baseline model. We plan to use this model’s performance as the baseline measurement for future iterations of modeling.

From iteration Take2, we constructed a VGG16 neural network as an alternate model. We also compared this model’s performance with the baseline model from iteration Take1.

From iteration Take3, we constructed an InceptionV3 neural network as an alternate model. We also compared this model’s performance with the baseline model from iteration Take1.

In this Take4 iteration, we will construct a ResNet50V2 neural network as an alternate model. We will compare this model’s performance with the baseline model from iteration Take1.

ANALYSIS: From iteration Take1, the baseline model’s performance achieved an accuracy score of 88.62% after 30 epochs using the training images. The baseline model also processed the validation images with an accuracy score of 85.37%.

From iteration Take2, the VGG16 model’s performance achieved an accuracy score of 83.57% after 30 epochs using the training images. The VGG16 model also processed the validation images with an accuracy score of 79.53%.

From iteration Take3, the InceptionV3 model’s performance achieved an accuracy score of 91.24% after 30 epochs using the training images. The InceptionV3 model also processed the validation images with an accuracy score of 87.10%.

In this Take4 iteration, the ResNet50V2 model’s performance achieved an accuracy score of 88.93% after 30 epochs using the training images. The ResNet50V2 model also processed the validation images with an accuracy score of 87.17%.

CONCLUSION: In this iteration, the TensorFlow CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Intel Image Classification Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/puneet6060/intel-image-classification

One potential source of performance benchmarks: https://www.kaggle.com/puneet6060/intel-image-classification

The HTML formatted report can be found here on GitHub.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Intel Image Classification dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: This dataset contains over 17,000 images of size 150×150 distributed under six categories: buildings, forest, glacier, mountain, sea, and street. There are approximately 14,000 images in the training set and 3,000 in the test/validation set. This dataset was initially published on https://datahack.analyticsvidhya.com by Intel as part of a data science competition.

From iteration Take1, we constructed a simple three-layer CNN neural network as the baseline model. We plan to use this model’s performance as the baseline measurement for future iterations of modeling.

From iteration Take2, we constructed a VGG16 neural network as an alternate model. We also compared this model’s performance with the baseline model from iteration Take1.

In this Take3 iteration, we will construct an InceptionV3 neural network as an alternate model. We will compare this model’s performance with the baseline model from iteration Take1.

ANALYSIS: From iteration Take1, the baseline model’s performance achieved an accuracy score of 88.62% after 30 epochs using the training images. The baseline model also processed the validation images with an accuracy score of 85.37%.

From iteration Take2, the VGG16 model’s performance achieved an accuracy score of 83.57% after 30 epochs using the training images. The VGG16 model also processed the validation images with an accuracy score of 79.53%.

In this Take3 iteration, the InceptionV3 model’s performance achieved an accuracy score of 91.24% after 30 epochs using the training images. The InceptionV3 model also processed the validation images with an accuracy score of 87.10%.

CONCLUSION: In this iteration, the TensorFlow CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Intel Image Classification Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/puneet6060/intel-image-classification

One potential source of performance benchmarks: https://www.kaggle.com/puneet6060/intel-image-classification

The HTML formatted report can be found here on GitHub.

In this Take2 iteration, we will construct a VGG16 neural network as an alternate model. We will compare this model’s performance with the baseline model from iteration Take1.

In this Take2 iteration, the VGG16 model’s performance achieved an accuracy score of 83.57% after 30 epochs using the training images. The VGG16 model also processed the validation images with an accuracy score of 79.53%.

Dataset Used: Intel Image Classification Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/puneet6060/intel-image-classification

The HTML formatted report can be found here on GitHub.

In this Take1 iteration, we will construct a simple three-layer CNN neural network as the baseline model. We will use this model’s performance as the baseline measurement for future iterations of modeling.

ANALYSIS: In this Take1 iteration, the baseline model’s performance achieved an accuracy score of 88.62% after 30 epochs using the training images. The baseline model also processed the validation images with an accuracy score of 85.37%.

Dataset Used: Intel Image Classification Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/puneet6060/intel-image-classification

The HTML formatted report can be found here on GitHub.

SUMMARY: This project aims to construct and test an algorithmic trading model and document the end-to-end steps using a template.

INTRODUCTION: This algorithmic trading model examines a simple trend-following strategy for a stock. The model buys a stock when the price reaches the highest price for the last X number of days. The model will exit the position when the stock price crosses below the mean of the same window size.

From iteration Take1, we set up the models using one fixed window size for long trades only. The window size varied from 10 to 50 trading days at a 5-day increment.

From iteration Take2, we set up the models using one fixed window size for long trades only. The window size varied from 10 to 50 trading days at a 5-day increment. The models also considered a volume indicator with its own window size to confirm the buy/sell signal.

From iteration Take3, we set up the models using one fixed window size for long and short trades. The window size varied from 10 to 50 trading days at a 5-day increment.

In this Take4 iteration, we will set up the models using one fixed window size for long and short trades. The window size will vary from 10 to 50 trading days at a 5-day increment. The models will also consider a volume indicator with its own window size to confirm the buy/sell signal.

ANALYSIS: From iteration Take1, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 81.49 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

From iteration Take2, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 82.47 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

From iteration Take3, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 79.95 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

In this Take4 iteration, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 74.70 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

CONCLUSION: For the stock of AAPL during the modeling time frame, the trading strategy did not produce a better return than the buy-and-hold approach. We should consider modeling this stock further by experimenting with more variations of the strategy.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Quandl

The HTML formatted report can be found here on GitHub.

SUMMARY: This project aims to construct and test an algorithmic trading model and document the end-to-end steps using a template.

INTRODUCTION: This algorithmic trading model examines a simple trend-following strategy for a stock. The model buys a stock when the price reaches the highest price for the last X number of days. The model will exit the position when the stock price crosses below the mean of the same window size.

From iteration Take1, we set up the models using one fixed window size for long trades only. The window size varied from 10 to 50 trading days at a 5-day increment.

From iteration Take2, we set up the models using one fixed window size for long trades only. The window size varied from 10 to 50 trading days at a 5-day increment. The models also considered a volume indicator with its own window size to confirm the buy/sell signal.

In this Take3 iteration, we will set up the models using one fixed window size for long and short trades. The window size will vary from 10 to 50 trading days at a 5-day increment.

ANALYSIS: From iteration Take1, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 81.49 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

From iteration Take2, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 82.47 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

In this Take3 iteration, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 79.95 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

CONCLUSION: For the stock of AAPL during the modeling time frame, the trading strategy did not produce a better return than the buy-and-hold approach. We should consider modeling this stock further by experimenting with more variations of the strategy.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Quandl

The HTML formatted report can be found here on GitHub.

From iteration Take1, we set up the models using one fixed window size for each model. The window size varied from 10 to 50 trading days at a 5-day increment.

In this Take2 iteration, we will set up the models using one fixed window size for each model. The window size will vary from 10 to 50 trading days at a 5-day increment. In addition, the model will take into account a volume indicator with its own window size to confirm the buy/sell signal.

In this Take2 iteration, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2019, and December 24, 2020. The trading model produced a profit of 82.47 dollars per share. The buy-and-hold approach yielded a gain of 92.60 dollars per share.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Quandl

The HTML formatted report can be found here on GitHub.
