metadata
license: mit
language:
  - en

Stock Price Forecasting for Google (2004–2022)

This repository contains the implementation and results of stock price forecasting on Google's historical data (2004–2022) using two models: ARIMA and a Temporal Convolutional Network (TCN). The goal was to predict future stock prices and compare the two models on standard error metrics.

Experiment Overview

The experiment aimed to forecast Google's stock prices using two distinct approaches:

  1. ARIMA: A statistical time-series model optimized using the pmdarima library.
  2. Temporal Convolutional Network (TCN): A deep learning model designed for sequence modeling, tuned with a grid search over hyperparameters.

Both models were trained and evaluated on Google's stock price data from 2004 to 2022. Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are reported for both models. Mean Absolute Percentage Error (MAPE) and R² are reported for the TCN only, and Direction Accuracy for ARIMA only.
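The metrics named above can be written out in a few lines of plain Python. This is a minimal sketch, not the repository's code: the function names and the toy numbers are made up for illustration, and the definition of direction accuracy used here (the sign of the predicted move relative to the previous value matches the sign of the actual move) is an assumption, since the README does not spell it out.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def direction_accuracy(previous, actual, predicted):
    """Fraction of steps where the predicted move (relative to the previous
    value) has the same sign as the actual move. Assumed definition; flat
    moves count as misses under the strict inequality."""
    hits = sum(
        1
        for prev, a, p in zip(previous, actual, predicted)
        if (a - prev) * (p - prev) > 0
    )
    return hits / len(actual)

# Toy values for illustration only (not taken from the experiment).
previous = [100.0, 102.0, 101.0, 105.0]
actual = [102.0, 101.0, 105.0, 104.0]
predicted = [101.0, 103.0, 104.0, 103.0]

print(mae(actual, predicted))                           # 1.25
print(rmse(actual, predicted))
print(mape(actual, predicted))
print(r2(actual, predicted))
print(direction_accuracy(previous, actual, predicted))  # 0.75
```

The "(Log)" variants reported for ARIMA would apply the same `mae` and `rmse` functions to log-transformed prices instead of raw prices.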

Methodology

1. ARIMA Model

  • Library: Used pmdarima for automatic ARIMA order selection based on the Akaike Information Criterion (AIC).
  • Approach: Employed a sliding window technique to evaluate different window sizes: [30, 60, 90, 120, 180, 200, 250] days.
  • Training: For each window size, the model was trained on historical data and tested on the next time step.
  • Evaluation Metrics:
    • MAE (Log): Mean Absolute Error on log-transformed prices.
    • RMSE (Log): Root Mean Squared Error on log-transformed prices.
    • MAE (Price): Mean Absolute Error on raw stock prices.
    • RMSE (Price): Root Mean Squared Error on raw stock prices.
    • Direction Accuracy: Share of forecasts that predict the correct direction of price movement, reported as a fraction between 0 and 1.
  • Best Window: Determined by the lowest RMSE (Price).
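The sliding-window procedure above can be sketched as a walk-forward loop. This is an illustration under stated assumptions, not the repository's implementation: the price series is synthetic, the helper names are hypothetical, and a naive last-value forecaster stands in for the ARIMA fit so that the example runs with the standard library alone. The sketch also assumes the model is fitted on log prices and that price-level errors are obtained by exponentiating the forecasts.

```python
import math

def naive_forecast(window):
    """Stand-in forecaster: predicts the last observed value.
    In the setup described above, an ARIMA model would be fitted on the
    window here instead (for example with pmdarima's auto_arima using
    AIC, followed by a one-step predict)."""
    return window[-1]

def walk_forward(series, window_size, forecast=naive_forecast):
    """Slide a fixed-size window over the series. At each position,
    forecast the next value from the window and record the outcome."""
    previous, actual, predicted = [], [], []
    for end in range(window_size, len(series)):
        window = series[end - window_size:end]
        predicted.append(forecast(window))
        actual.append(series[end])
        previous.append(series[end - 1])
    return previous, actual, predicted

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Synthetic, strictly positive price series (illustration only).
prices = [100.0 + 0.5 * t + 3.0 * math.sin(t / 5.0) for t in range(400)]
log_prices = [math.log(p) for p in prices]

results = {}
for w in [30, 60, 90, 120, 180, 200, 250]:
    _, actual, predicted = walk_forward(log_prices, w)
    results[w] = {
        "rmse_log": rmse(actual, predicted),
        # Map log-scale values back to the price scale before scoring.
        "rmse_price": rmse([math.exp(a) for a in actual],
                           [math.exp(p) for p in predicted]),
        "n_forecasts": len(actual),
    }

# Pick the window with the lowest price-level RMSE, as in the README.
best_window = min(results, key=lambda w: results[w]["rmse_price"])
print(best_window, results[best_window])
```

In this sketch, larger windows leave fewer forecast points (`n_forecasts`), so the window sizes are not scored on identical test sets. Whether the original experiment aligned the evaluation periods is not stated in the README.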

ARIMA Results

The best-performing window size was 90 days. Results for all window sizes are listed below, sorted by ascending RMSE (Price):

| Window Size | MAE (Log) | RMSE (Log) | MAE (Price) | RMSE (Price) | Direction Accuracy |
|---|---|---|---|---|---|
| 90 | 0.013572 | 0.019500 | 589.623929 | 767.501125 | 0.220339 |
| 250 | 0.013374 | 0.019387 | 591.532659 | 794.248966 | 0.325424 |
| 200 | 0.013437 | 0.019336 | 608.357138 | 805.855776 | 0.277966 |
| 30 | 0.013885 | 0.019906 | 595.957410 | 813.015790 | 0.170621 |
| 180 | 0.013556 | 0.019484 | 618.712818 | 820.254764 | 0.240678 |
| 60 | 0.013644 | 0.019658 | 662.900353 | 875.255252 | 0.198870 |
| 120 | 0.013619 | 0.019585 | 718.594515 | 919.414804 | 0.190960 |

Best Window for RMSE (Price): 90 days (RMSE: 767.501125)

2. Temporal Convolutional Network (TCN)

  • Approach: A TCN model was implemented with a grid search over multiple hyperparameters to identify the best configuration.
  • Hyperparameters:
    • Sequence Lengths: [20, 50]
    • Batch Sizes: [16, 32]
    • Learning Rates: [0.001, 0.0005]
    • Kernel Sizes: [3, 5]
    • Number of Channels: [[32, 64, 128], [64, 128, 256]]
    • Dropout Rates: [0.1, 0.2]
  • Training: The model was trained on all unique combinations of the hyperparameter grid, and performance was evaluated on a test set.
  • Evaluation Metrics:
    • MAE: Mean Absolute Error on stock prices.
    • RMSE: Root Mean Squared Error on stock prices.
    • MAPE: Mean Absolute Percentage Error.
    • R²: Coefficient of determination.
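For a sense of the search size, the grid above can be enumerated with `itertools.product`. This is a minimal sketch: the dictionary keys are illustrative names, not necessarily those used in the repository, and the training step itself is omitted.

```python
from itertools import product

# Hyperparameter values as listed in the README; key names are assumed.
grid = {
    "sequence_length": [20, 50],
    "batch_size": [16, 32],
    "learning_rate": [0.001, 0.0005],
    "kernel_size": [3, 5],
    "num_channels": [[32, 64, 128], [64, 128, 256]],
    "dropout": [0.1, 0.2],
}

keys = list(grid)
# One dict per unique combination of the six hyperparameters.
configs = [dict(zip(keys, values)) for values in product(*grid.values())]

print(len(configs))  # 64
print(configs[0])
```

With two values for each of six hyperparameters, the grid contains 2^6 = 64 configurations, each of which would be trained and scored on the test set.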

TCN Results

The best-performing TCN configuration was:

  • Sequence Length: 50
  • Batch Size: 16
  • Learning Rate: 0.0005
  • Kernel Size: 3
  • Number of Channels: [32, 64, 128]
  • Dropout: 0.1

Metrics for Best TCN Model:

  • MAE: 9.25931
  • RMSE: 14.984981
  • MAPE: 1.977077%
  • R²: 0.999459953983791

The full results for all hyperparameter combinations are available in the tcn_results.csv file in the repository.

Model Comparison

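On price-level error, the best TCN configuration outperformed the best ARIMA window by a wide margin: MAE of 9.26 versus 589.62 and RMSE of 14.98 versus 767.50. MAPE and R² were computed for the TCN only. Below is a comparison of the best configurations for each model: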

| Model | MAE (Price) | RMSE (Price) | MAPE (%) | R² |
|---|---|---|---|---|
| ARIMA (90 days) | 589.623929 | 767.501125 | N/A | N/A |
| TCN (Best) | 9.25931 | 14.984981 | 1.977077 | 0.99946 |