---
license: mit
language: en
library_name: stable-baselines3
repo_url: https://github.com/JonusNattapong/Reinforcement-Learning-for-Gold-Trading
tags:
- reinforcement-learning
- finance
- gold-trading
- xauusd
- ppo
metrics:
- sharpe_ratio
- win_rate
pipeline_tag: reinforcement-learning
datasets:
- ZombitX64/xauusd-gold-price-historical-data-2004-2025
---
# PPO Model for XAUUSD Gold Trading
This repository contains a Reinforcement Learning model trained using Proximal Policy Optimization (PPO) for trading XAUUSD (Gold vs US Dollar) on 15-minute timeframes.
## Model Details
- **Model Type**: PPO (Proximal Policy Optimization)
- **Framework**: Stable-Baselines3
- **Environment**: Custom Gym environment for XAUUSD trading
- **Training Data**: Historical XAUUSD data from 2004 to 2025 (resampled to 15-min bars)
- **Total Timesteps**: 1,000,000
- **Position Sizing**: Base 5.0 oz, Max 7.5 oz
- **Initial Capital**: 200 USD
- **Transaction Cost**: 0.65 USD per oz
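As a quick sanity check on the cost model above, the round-trip cost of a trade follows from the per-ounce fee; the helper below assumes (as a simplification, since the card does not say) that the 0.65 USD/oz fee is charged once per side:

```python
def round_trip_cost(position_oz: float, fee_per_oz: float = 0.65) -> float:
    """Transaction cost for opening and closing a position.

    Assumes the per-ounce fee applies on each side (entry and exit);
    adjust if the fee quoted above is already round-trip.
    """
    return 2 * position_oz * fee_per_oz

print(round_trip_cost(5.0))  # base position: 6.5 USD round trip
print(round_trip_cost(7.5))  # max position: 9.75 USD round trip
```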
## Performance Metrics (Test Set)
- **Average Daily Profit**: 51.46 USD
- **Win Rate**: 69.0%
- **Max Drawdown**: 12.0%
- **Sharpe Ratio**: 7.56
- **Average Trades per Day**: 2.66
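Metrics like the win rate and Sharpe ratio above can be recomputed from a series of daily P&L values. The sketch below uses the common annualization by sqrt(252) and a zero risk-free rate; the authors' exact methodology is not documented, so treat this as an illustration rather than their formula:

```python
import math

def win_rate(pnls):
    """Fraction of trades (or days) with positive P&L."""
    wins = sum(1 for p in pnls if p > 0)
    return wins / len(pnls)

def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    var = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

# Toy example with hypothetical daily P&L in USD (stands in for returns
# when account capital is roughly constant)
pnl = [60.0, -20.0, 45.0, 70.0, -10.0, 55.0]
print(f"win rate: {win_rate(pnl):.1%}")  # win rate: 66.7%
```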
## Features Used
- Log Return
- RSI (14-period)
- Moving Averages (short/long)
- Bollinger Bands
- MACD
- Volume indicators
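The indicator features above can be derived from the close-price series alone (except the volume indicators). The sketch below shows one plausible implementation; the window lengths other than RSI(14) are guesses, and the repository may use a Wilder-smoothed RSI or different periods:

```python
import numpy as np
import pandas as pd

def compute_features(close: pd.Series) -> pd.DataFrame:
    """Illustrative versions of the listed indicator features."""
    feats = pd.DataFrame(index=close.index)
    feats["log_return"] = np.log(close / close.shift(1))

    # RSI(14), simple-moving-average variant
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    feats["rsi_14"] = 100 - 100 / (1 + gain / loss)

    # Short/long moving averages (assumed 10/50 bars)
    feats["ma_short"] = close.rolling(10).mean()
    feats["ma_long"] = close.rolling(50).mean()

    # Bollinger Bands (20-bar mean, 2 standard deviations)
    ma20 = close.rolling(20).mean()
    sd20 = close.rolling(20).std()
    feats["bb_upper"] = ma20 + 2 * sd20
    feats["bb_lower"] = ma20 - 2 * sd20

    # MACD (12/26 EMA difference) and its 9-EMA signal line
    ema12 = close.ewm(span=12, adjust=False).mean()
    ema26 = close.ewm(span=26, adjust=False).mean()
    feats["macd"] = ema12 - ema26
    feats["macd_signal"] = feats["macd"].ewm(span=9, adjust=False).mean()
    return feats
```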
## Source Code
- GitHub: https://github.com/JonusNattapong/Reinforcement-Learning-for-Gold-Trading
## Usage
### Loading the Model
Below are two safe ways to load the trained policy depending on what you have available.
Option A — Load the full Stable-Baselines3 model (.zip)
```python
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecNormalize
import os
# Create or reconstruct an environment similar to the one used for training
# e.g. `env = make_your_env(...)` — replace with your env factory
env = ...
# If you saved VecNormalize separately, load and wrap your env first
if os.path.exists("models/vecnormalize.pkl"):
    vec = VecNormalize.load("models/vecnormalize.pkl", env)
    vec.training = False
    vec.norm_reward = False
    env = vec
# Load the full model (policy + optimizer state)
model = PPO.load("models/ppo_xauusd.zip", env=env)
```
Option B — Load weights saved as SafeTensors into a fresh PPO policy
```python
from safetensors.torch import load_file
import torch
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecNormalize
import os
# Create or reconstruct the same environment used for training
env = ...
# If you have VecNormalize statistics, load them and wrap the env
if os.path.exists("models/vecnormalize.pkl"):
    vec = VecNormalize.load("models/vecnormalize.pkl", env)
    vec.training = False
    vec.norm_reward = False
    env = vec
# Instantiate a PPO model with the same policy architecture
model = PPO("MlpPolicy", env)
# safetensors' load_file already returns a dict of torch.Tensor
state_dict = load_file("models/ppo_xauusd.safetensors")
# Load weights into the policy
model.policy.load_state_dict(state_dict)
# Ensure the model has the same env wrapper
model.set_env(env)
```
Notes:
- Option A is preferred when `ppo_xauusd.zip` is available (it contains the entire SB3 model).
- Option B is useful when only the policy weights were exported as SafeTensors. Ensure the policy architecture and observation/action spaces match the original training setup.
- Always set `vec.training = False` and `vec.norm_reward = False` when running inference.
### For Full Inference
To use the model for trading, you'll need to:
1. Set up the trading environment (`XAUUSDTradingEnv`)
2. Load VecNormalize stats
3. Run predictions
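The loop below sketches step 3 for any loaded SB3 model; `XAUUSDTradingEnv` itself is defined in the GitHub repository and not reproduced here. It assumes the VecEnv-style step API (batched rewards and dones) that the VecNormalize wrapper exposes:

```python
def run_episode(model, env, max_steps=1000):
    """Run one evaluation episode and return cumulative reward.

    Assumes a VecEnv-style API: reset() -> obs, and
    step(action) -> (obs, rewards, dones, infos) with batched values.
    """
    obs = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # deterministic=True uses the policy's most likely action
        action, _states = model.predict(obs, deterministic=True)
        obs, rewards, dones, infos = env.step(action)
        total_reward += float(rewards[0])
        if dones[0]:
            break
    return total_reward
```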
Note: This is a simulation model. Use with caution in real trading.
## Training Configuration
- Learning Rate: 0.0003
- Batch Size: 256
- Gamma: 0.99
- GAE Lambda: 0.95
- Clip Range: 0.2
- Entropy Coefficient: 0.01
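These hyperparameters map directly onto keyword arguments of Stable-Baselines3's `PPO` constructor. The sketch below collects them in a dict; the policy class and any unlisted settings (e.g. `n_steps`) are not given above, so SB3 defaults are assumed:

```python
ppo_kwargs = dict(
    learning_rate=3e-4,  # 0.0003
    batch_size=256,
    gamma=0.99,          # discount factor
    gae_lambda=0.95,
    clip_range=0.2,
    ent_coef=0.01,       # entropy coefficient
)

# With Stable-Baselines3 installed and an `env` built, training would be:
# from stable_baselines3 import PPO
# model = PPO("MlpPolicy", env, **ppo_kwargs)
# model.learn(total_timesteps=1_000_000)
```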
## Files
- `ppo_xauusd.safetensors`: Model weights in SafeTensors format
- `vecnormalize.pkl`: VecNormalize statistics for observation normalization
## License
MIT License
## Disclaimer
This model is for educational and research purposes only. Trading involves risk, and past performance does not guarantee future results. Always backtest and validate before using in live trading.