---
license: mit
language:
- en
---
# Stock Price Forecasting for Google (2004–2022)
This repository contains the implementation and results of stock price forecasting for Google's historical data (2004–2022) using two models: **ARIMA** and **Temporal Convolutional Network (TCN)**. The goal was to predict future stock prices and evaluate model performance using various metrics.
## Experiment Overview
The experiment aimed to forecast Google's stock prices using two distinct approaches:
1. **ARIMA**: A statistical time-series model optimized using the `pmdarima` library.
2. **Temporal Convolutional Network (TCN)**: A deep learning model designed for sequence modeling, tuned with a grid search over hyperparameters.
Both models were trained and evaluated on Google's stock price data from 2004 to 2022. The evaluation metrics include Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), R², and Direction Accuracy (for ARIMA).
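As a reference for the standard error metrics named above, here is a minimal pure-Python sketch. The function names and the toy numbers are illustrative assumptions, not code or data from this repository.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (assumes no zero in y_true)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (not taken from the experiment).
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [99.0, 101.0, 102.0, 104.0]

print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
print("MAPE:", mape(actual, predicted))
print("R2  :", r2(actual, predicted))
```

MAE and RMSE are expressed in the units of the target, so values computed on log prices and on raw prices are not directly comparable with each other.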
## Methodology
### 1. ARIMA Model
- **Library**: Used `pmdarima` for automatic ARIMA order selection based on the Akaike Information Criterion (AIC).
- **Approach**: Employed a sliding window technique to evaluate different window sizes: `[30, 60, 90, 120, 180, 200, 250]` days.
- **Training**: For each window size, the model was trained on historical data and tested on the next time step.
- **Evaluation Metrics**:
  - MAE (Log): Mean Absolute Error on log-transformed prices.
  - RMSE (Log): Root Mean Squared Error on log-transformed prices.
  - MAE (Price): Mean Absolute Error on raw stock prices.
  - RMSE (Price): Root Mean Squared Error on raw stock prices.
  - Direction Accuracy: Share of forecasts that correctly predict the direction of the price movement, reported as a fraction between 0 and 1.
- **Best Window**: Determined by the lowest RMSE (Price).
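The sliding-window procedure above can be sketched as follows. This is a hedged illustration, not the repository's code: it assumes the model is fitted on log prices (suggested by the log-scale metrics), that direction is judged relative to the last value in the window, and that `seasonal=False` is appropriate. The helper names (`sliding_window_eval`, `best_window`, `auto_arima_forecast`) are hypothetical. The default forecaster is a naive placeholder so the sketch runs without `pmdarima` installed.

```python
import math


def sign(x):
    return (x > 0) - (x < 0)


def naive_forecast(log_window):
    """Placeholder forecaster: repeat the last observed log price."""
    return log_window[-1]


def auto_arima_forecast(log_window):
    """One-step forecast with AIC-based order selection (requires pmdarima)."""
    import pmdarima as pm  # imported lazily so the rest of the sketch runs without it

    model = pm.auto_arima(
        log_window,
        information_criterion="aic",
        seasonal=False,  # assumption: no seasonal component
        suppress_warnings=True,
        error_action="ignore",
    )
    return float(model.predict(n_periods=1)[0])


def sliding_window_eval(prices, window_size, forecast_fn=naive_forecast):
    """Fit on each window of `window_size` points and score the next time step."""
    log_prices = [math.log(p) for p in prices]
    log_errs, price_errs, hits = [], [], 0
    for t in range(window_size, len(prices)):
        window = log_prices[t - window_size:t]
        pred_log = forecast_fn(window)
        log_errs.append(pred_log - log_prices[t])
        price_errs.append(math.exp(pred_log) - prices[t])
        # Direction is "correct" when predicted and actual moves share a sign.
        hits += sign(pred_log - window[-1]) == sign(log_prices[t] - window[-1])
    n = len(log_errs)
    return {
        "window": window_size,
        "n_forecasts": n,
        "mae_log": sum(abs(e) for e in log_errs) / n,
        "rmse_log": math.sqrt(sum(e * e for e in log_errs) / n),
        "mae_price": sum(abs(e) for e in price_errs) / n,
        "rmse_price": math.sqrt(sum(e * e for e in price_errs) / n),
        "direction_accuracy": hits / n,
    }


def best_window(prices, window_sizes, forecast_fn=naive_forecast):
    """Evaluate every window size and pick the one with the lowest RMSE (Price)."""
    results = [sliding_window_eval(prices, w, forecast_fn) for w in window_sizes]
    return min(results, key=lambda r: r["rmse_price"]), results


# Synthetic series for illustration only (1% growth per step).
prices = [100.0 * 1.01 ** i for i in range(40)]
best, all_results = best_window(prices, [10, 20, 30])
print(best["window"], round(best["rmse_price"], 4))
```

To use ARIMA instead of the placeholder, pass `forecast_fn=auto_arima_forecast`. Refitting `auto_arima` at every step is slow, so a real run over 18 years of data may need caching or a coarser step.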
#### ARIMA Results
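Ranked by RMSE (Price), the best-performing window size was **90 days**. The table below lists all window sizes in that order. Note that the 250-day window has the lowest MAE (Log) and the highest direction accuracy: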
| Window Size | MAE (Log) | RMSE (Log) | MAE (Price) | RMSE (Price) | Direction Accuracy |
|-------------|-----------|------------|-------------|--------------|--------------------|
| 90 | 0.013572 | 0.019500 | 589.623929 | **767.501125** | 0.220339 |
| 250 | 0.013374 | 0.019387 | 591.532659 | 794.248966 | 0.325424 |
| 200 | 0.013437 | 0.019336 | 608.357138 | 805.855776 | 0.277966 |
| 30 | 0.013885 | 0.019906 | 595.957410 | 813.015790 | 0.170621 |
| 180 | 0.013556 | 0.019484 | 618.712818 | 820.254764 | 0.240678 |
| 60 | 0.013644 | 0.019658 | 662.900353 | 875.255252 | 0.198870 |
| 120 | 0.013619 | 0.019585 | 718.594515 | 919.414804 | 0.190960 |
**Best Window for RMSE (Price)**: 90 days (RMSE: 767.501125)
### 2. Temporal Convolutional Network (TCN)
- **Approach**: A TCN model was implemented with a grid search over multiple hyperparameters to identify the best configuration.
- **Hyperparameters**:
  - Sequence Lengths: `[20, 50]`
  - Batch Sizes: `[16, 32]`
  - Learning Rates: `[0.001, 0.0005]`
  - Kernel Sizes: `[3, 5]`
  - Number of Channels: `[[32, 64, 128], [64, 128, 256]]`
  - Dropout Rates: `[0.1, 0.2]`
- **Training**: The model was trained on all unique combinations of the hyperparameter grid, and performance was evaluated on a test set.
- **Evaluation Metrics**:
  - MAE: Mean Absolute Error on stock prices.
  - RMSE: Root Mean Squared Error on stock prices.
  - MAPE: Mean Absolute Percentage Error.
  - R²: Coefficient of determination.
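The grid described above has six hyperparameters with two values each, which gives 64 unique combinations. The sketch below enumerates them with `itertools.product`. The dictionary keys, the `train_and_score` callback, and the choice of lowest test RMSE as the selection criterion are assumptions for illustration; the source does not state which metric picked the best configuration.

```python
from itertools import product

# Values taken from the hyperparameter list above; key names are hypothetical.
param_grid = {
    "seq_length": [20, 50],
    "batch_size": [16, 32],
    "learning_rate": [0.001, 0.0005],
    "kernel_size": [3, 5],
    "num_channels": [[32, 64, 128], [64, 128, 256]],
    "dropout": [0.1, 0.2],
}


def iter_configs(grid):
    """Yield one dict per unique combination of hyperparameter values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def grid_search(grid, train_and_score):
    """Run `train_and_score(config) -> test RMSE` for every config, keep the best."""
    best_config, best_score, rows = None, float("inf"), []
    for config in iter_configs(grid):
        score = train_and_score(config)
        rows.append({**config, "rmse": score})
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score, rows


configs = list(iter_configs(param_grid))
print(len(configs))  # 64
```

`train_and_score` is where the actual TCN training and test-set evaluation would go; `rows` holds one record per configuration and could be written to a results file.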
#### TCN Results
The best-performing TCN configuration was:
- **Sequence Length**: 50
- **Batch Size**: 16
- **Learning Rate**: 0.0005
- **Kernel Size**: 3
- **Number of Channels**: [32, 64, 128]
- **Dropout**: 0.1
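For context on how kernel size and the number of channel levels relate to sequence length, the sketch below shows a causal dilated 1D convolution and a receptive-field calculation. It assumes the common TCN layout with two causal convolutions per residual block and dilations doubling per level (1, 2, 4, ...); the repository's actual architecture may differ, and both function names are hypothetical.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """Causal convolution: output[t] depends only on x[t], x[t-d], x[t-2d], ...

    Positions before the start of the sequence are treated as zeros.
    """
    k = len(weights)
    out = []
    for t in range(len(x)):
        total = 0.0
        for j, w in enumerate(weights):
            idx = t - (k - 1 - j) * dilation
            if idx >= 0:
                total += w * x[idx]
        out.append(total)
    return out


def tcn_receptive_field(kernel_size, num_levels, convs_per_block=2):
    """Number of past time steps that can influence one output step.

    Each causal conv with dilation d extends the receptive field by (k - 1) * d.
    """
    dilations = [2 ** i for i in range(num_levels)]
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)


print(causal_dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2))  # [1.0, 2.0, 4.0, 6.0, 8.0]
print(tcn_receptive_field(kernel_size=3, num_levels=3))  # 29 under the stated assumptions
print(tcn_receptive_field(kernel_size=5, num_levels=3))  # 57 under the stated assumptions
```

Under these assumptions, a three-level network with kernel size 3 would see at most 29 past steps from the final position, so a longer input sequence does not by itself extend how far back the last output can look.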
**Metrics for Best TCN Model**:
- MAE: 9.25931
- RMSE: 14.984981
- MAPE: 1.977077%
- R²: 0.999459953983791
The full results for all hyperparameter combinations are available in the `tcn_results.csv` file in the repository.
## Model Comparison
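On the two metrics reported for both models, MAE and RMSE on price, the best TCN configuration had far lower errors than the best ARIMA window. MAPE and R² were computed only for the TCN, and the two models were evaluated under different protocols (one-step sliding-window forecasts for ARIMA, a held-out test set for the TCN), so the comparison is indicative rather than exact. Below is a comparison of the best configurations for each model: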
| Model | MAE (Price) | RMSE (Price) | MAPE (%) | R² |
|-------------|-------------|--------------|----------|----------|
| ARIMA (90 days) | 589.623929 | 767.501125 | N/A | N/A |
| TCN (Best) | **9.25931** | **14.984981** | **1.977077** | **0.99946** |