---

language: en
license: mit
library_name: sklearn
tags:
- sklearn
- gold-price-prediction
- time-series
- classification
- financial-prediction
datasets:
- custom
metrics:
- accuracy
- f1-score
- roc-auc
model-index:
- name: Gold Price Direction Predictor
  results:
  - task:
      type: classification
      name: Binary Classification
    dataset:
      type: custom
      name: Antam Gold Prices
    metrics:
    - type: accuracy
      value: 0.55  # Approximate from training
      name: Accuracy
    - type: f1
      value: 0.56  # Approximate
      name: F1 Score
    - type: roc_auc
      value: 0.58  # Approximate
      name: ROC AUC
---


# Gold Price Direction Predictor

This model predicts the next-day direction of gold prices (up or down) based on historical Antam gold price data and technical indicators.

## Model Description

- **Model Type**: Binary Classification (Gradient Boosting / XGBoost / LightGBM)
- **Task**: Predict whether gold price will go up or down the next day
- **Input**: Feature vector with technical indicators (returns, lags, RSI, MACD, Bollinger Bands, etc.)
- **Output**: Probability of the price going up (0-1), thresholded at an optimized cutoff (0.52) for the final up/down prediction

## Intended Uses & Limitations

### Intended Uses
- Financial analysis and decision support
- Educational purposes for machine learning in finance
- Research on gold price prediction

### Limitations
- Trained on historical Antam gold prices only
- May not generalize to other markets or time periods
- Prediction accuracy is around 55-60% on holdout data (only modestly better than random)
- Requires up-to-date feature computation for real-time use

## How to Use

### Loading the Model

```python
from huggingface_hub import hf_hub_download
from joblib import load

# Download the serialized model from the Hub and load it
model_path = hf_hub_download("theonegareth/GoldPricePredictor", "gold_direction_model.joblib")
model = load(model_path)
```

### Making Predictions

The model expects a pandas DataFrame with the same feature columns used in training.

```python
import pandas as pd

# Example feature vector (you need to compute these from your data)
features = pd.DataFrame({
    'ret': [0.01],
    'log_ret': [0.00995],
    'ret_lag_1': [0.005],
    # ... all required features
})

# Predict the probability of the price going up, then apply the optimized threshold
proba_up = model.predict_proba(features)[:, 1]
prediction = (proba_up >= 0.52).astype(int)
```

### Feature Engineering

To use this model, you need to compute the same features from your gold price data:

- Daily returns and log returns
- Lagged returns (1-5 days)
- Rolling means and standard deviations (3-, 5-, 10-, and 20-day windows)
- RSI (14-day)
- MACD and signal
- Bollinger Bands
- Day of week and month

See the training notebooks for the complete `add_features_adaptive` function.
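
The features above can be sketched with pandas as follows. This is an illustrative approximation, not the actual `add_features_adaptive` function; column names such as `price` and the exact smoothing choices (Wilder-style EWM for RSI, 12/26/9 MACD, 20-day 2-sigma Bollinger Bands) are assumptions based on common conventions:

```python
import numpy as np
import pandas as pd

def add_basic_features(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative sketch of the feature set described above.
    Assumes a DatetimeIndex and a 'price' column."""
    out = df.copy()
    # Daily returns and log returns
    out["ret"] = out["price"].pct_change()
    out["log_ret"] = np.log(out["price"]).diff()
    # Lagged returns (1-5 days)
    for lag in range(1, 6):
        out[f"ret_lag_{lag}"] = out["ret"].shift(lag)
    # Rolling means and standard deviations
    for w in (3, 5, 10, 20):
        out[f"ret_mean_{w}"] = out["ret"].rolling(w).mean()
        out[f"ret_std_{w}"] = out["ret"].rolling(w).std()
    # 14-day RSI (Wilder-style exponential smoothing)
    delta = out["price"].diff()
    gain = delta.clip(lower=0).ewm(alpha=1 / 14, adjust=False).mean()
    loss = (-delta.clip(upper=0)).ewm(alpha=1 / 14, adjust=False).mean()
    out["rsi_14"] = 100 - 100 / (1 + gain / loss)
    # MACD (12/26-day EMA difference) and its 9-day signal line
    ema12 = out["price"].ewm(span=12, adjust=False).mean()
    ema26 = out["price"].ewm(span=26, adjust=False).mean()
    out["macd"] = ema12 - ema26
    out["macd_signal"] = out["macd"].ewm(span=9, adjust=False).mean()
    # Bollinger Bands (20-day mean, 2 standard deviations)
    mid = out["price"].rolling(20).mean()
    sd = out["price"].rolling(20).std()
    out["bb_upper"] = mid + 2 * sd
    out["bb_lower"] = mid - 2 * sd
    # Calendar features
    out["day_of_week"] = out.index.dayofweek
    out["month"] = out.index.month
    return out
```

Whatever implementation you use, the column names and ordering must match what the model saw during training.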

## Training Data

- Source: Antam historical gold prices (Indonesian market)
- Period: [Insert date range from your data]
- Features: 25+ technical indicators
- Target: Next-day price direction (up=1, down=0)

## Performance

Based on holdout testing:
- Accuracy: ~55%
- F1 Score: ~56%
- ROC AUC: ~58%

See the confusion matrix, ROC curve, and feature importance plots in the repository.
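
These figures can be recomputed from held-out predictions with scikit-learn's metric functions. A minimal sketch with hypothetical labels and probabilities (not the actual holdout data):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Hypothetical holdout labels and predicted up-probabilities
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
proba_up = np.array([0.61, 0.40, 0.55, 0.70, 0.48, 0.53, 0.58, 0.30])
y_pred = (proba_up >= 0.52).astype(int)  # same optimized threshold as above

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("ROC AUC:", roc_auc_score(y_true, proba_up))  # uses probabilities, not labels
```

Note that ROC AUC is computed from the raw probabilities, while accuracy and F1 depend on the chosen threshold.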

## Training Procedure

1. Data preprocessing and feature engineering
2. Time-series split for cross-validation
3. Hyperparameter tuning with RandomizedSearchCV
4. Model selection based on F1 score
5. Threshold optimization for final predictions

Models compared: Gradient Boosting, XGBoost, LightGBM
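
Steps 2-4 can be sketched with scikit-learn's `TimeSeriesSplit` and `RandomizedSearchCV`. The search space below is illustrative only; the actual grid and data live in the training notebooks:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import TimeSeriesSplit, RandomizedSearchCV

# Synthetic stand-in data; replace with the engineered feature matrix
rng = np.random.RandomState(0)
X = rng.randn(300, 5)
y = (rng.rand(300) > 0.5).astype(int)

cv = TimeSeriesSplit(n_splits=5)  # preserves temporal order, no shuffling
param_dist = {
    "n_estimators": [50, 100, 200],
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [2, 3, 4],
}
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions=param_dist,
    n_iter=5,
    scoring="f1",   # model selection based on F1, as in step 4
    cv=cv,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

A time-ordered split matters here: a shuffled K-fold would leak future prices into the training folds and inflate the scores.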

## Contact

For questions or issues, please open an issue on this repository.

## License

MIT License