JonusNattapong committed on
Commit 529bf3b · verified · 1 Parent(s): 1c1d5eb

Update README.md

Files changed (1): README.md (+146, -144)

README.md (after this commit):
---
license: mit
language: en
library_name: stable-baselines3
tags:
- reinforcement-learning
- finance
- gold-trading
- xauusd
- ppo
metrics:
- sharpe_ratio
- win_rate
pipeline_tag: reinforcement-learning
datasets:
- ZombitX64/xauusd-gold-price-historical-data-2004-2025
---

# PPO Model for XAUUSD Gold Trading

This repository contains a Reinforcement Learning model trained using Proximal Policy Optimization (PPO) for trading XAUUSD (Gold vs US Dollar) on 15-minute timeframes.

## Model Details

- **Model Type**: PPO (Proximal Policy Optimization)
- **Framework**: Stable-Baselines3
- **Environment**: Custom Gym environment for XAUUSD trading
- **Training Data**: Historical XAUUSD data from 2004 to 2025 (resampled to 15-minute bars)
- **Total Timesteps**: 1,000,000
- **Position Sizing**: Base 5.0 oz, max 7.5 oz
- **Initial Capital**: 200 USD
- **Transaction Cost**: 0.65 USD per oz

## Performance Metrics (Test Set)

- **Average Daily Profit**: 51.46 USD
- **Win Rate**: 69.0%
- **Max Drawdown**: 12.0%
- **Sharpe Ratio**: 7.56
- **Average Trades per Day**: 2.66

## Features Used

- Log Return
- RSI (14-period)
- Moving Averages (short/long)
- Bollinger Bands
- MACD
- Volume indicators

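For reference, here is a minimal sketch of how indicators like these could be computed with pandas. The column names, window lengths, and smoothing choices are assumptions; the actual training pipeline may differ.

```python
import numpy as np
import pandas as pd

def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add indicator columns to an OHLCV frame with 'close' and 'volume' columns."""
    close = df["close"]

    # Log return of consecutive closes
    df["log_return"] = np.log(close / close.shift(1))

    # RSI (14-period, Wilder-style exponential smoothing)
    delta = close.diff()
    avg_gain = delta.clip(lower=0).ewm(alpha=1 / 14, adjust=False).mean()
    avg_loss = (-delta.clip(upper=0)).ewm(alpha=1 / 14, adjust=False).mean()
    df["rsi_14"] = 100 - 100 / (1 + avg_gain / avg_loss)

    # Short/long simple moving averages (window lengths are assumptions)
    df["ma_short"] = close.rolling(10).mean()
    df["ma_long"] = close.rolling(50).mean()

    # Bollinger Bands (20-period mean +/- 2 standard deviations)
    ma20 = close.rolling(20).mean()
    sd20 = close.rolling(20).std()
    df["bb_upper"] = ma20 + 2 * sd20
    df["bb_lower"] = ma20 - 2 * sd20

    # MACD (12/26 EMA difference) and its 9-period signal line
    ema12 = close.ewm(span=12, adjust=False).mean()
    ema26 = close.ewm(span=26, adjust=False).mean()
    df["macd"] = ema12 - ema26
    df["macd_signal"] = df["macd"].ewm(span=9, adjust=False).mean()

    # Volume relative to its recent average as a simple volume indicator
    df["vol_ratio"] = df["volume"] / df["volume"].rolling(20).mean()

    return df.dropna()
```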

## Usage

### Loading the Model

Below are two safe ways to load the trained policy, depending on which artifacts you have available.

**Option A: load the full Stable-Baselines3 model (`.zip`)**

```python
import os

from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecNormalize

# Create or reconstruct an environment similar to the one used for training,
# e.g. `env = DummyVecEnv([lambda: make_your_env(...)])`, replacing
# `make_your_env` with your own env factory. Note that VecNormalize.load
# expects a vectorized env (VecEnv).
env = ...

# If you saved VecNormalize separately, load the stats and wrap the env first
if os.path.exists("models/vecnormalize.pkl"):
    env = VecNormalize.load("models/vecnormalize.pkl", env)
    env.training = False     # do not update running stats at inference
    env.norm_reward = False  # return raw (unnormalized) rewards

# Load the full model (policy, hyperparameters, and optimizer state)
model = PPO.load("models/ppo_xauusd.zip", env=env)
```

**Option B: load weights saved as SafeTensors into a fresh PPO policy**

```python
import os

from safetensors.torch import load_file
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import VecNormalize

# Create or reconstruct the same environment used for training
# (a vectorized env, as in Option A)
env = ...

# If you have VecNormalize statistics, load them and wrap the env
if os.path.exists("models/vecnormalize.pkl"):
    env = VecNormalize.load("models/vecnormalize.pkl", env)
    env.training = False
    env.norm_reward = False

# Instantiate a PPO model with the same policy architecture and
# observation/action spaces as the original training run
model = PPO("MlpPolicy", env)

# load_file already returns a dict of torch.Tensor, ready for load_state_dict
state_dict = load_file("models/ppo_xauusd.safetensors")

# Load the exported weights into the policy network
model.policy.load_state_dict(state_dict)
```

Notes:
- Option A is preferred when `ppo_xauusd.zip` is available (it contains the entire SB3 model).
- Option B is useful when only the policy weights were exported as SafeTensors. Ensure the policy architecture and observation/action spaces match the original training setup.
- Always set `training = False` and `norm_reward = False` on the `VecNormalize` wrapper when running inference.

### For Full Inference

To use the model for trading, you'll need to:
1. Set up the trading environment (`XAUUSDTradingEnv`)
2. Load the VecNormalize stats
3. Run predictions (see the sketch below)

Note: This is a simulation model. Use with caution in real trading.

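A minimal prediction-loop sketch, assuming `model` and `env` were set up as in Option A above (with a vectorized env, `step()` returns batched arrays):

```python
import numpy as np

# Run one evaluation episode with the greedy (deterministic) policy
obs = env.reset()
episode_reward = 0.0
done = False
while not done:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, dones, info = env.step(action)
    episode_reward += float(np.asarray(reward).sum())
    done = bool(np.asarray(dones).any())

print(f"Episode reward: {episode_reward:.2f}")
```
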
## Training Configuration

- Learning Rate: 0.0003
- Batch Size: 256
- Gamma: 0.99
- GAE Lambda: 0.95
- Clip Range: 0.2
- Entropy Coefficient: 0.01

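These values map directly onto SB3's `PPO` constructor, as in the sketch below (arguments not listed above, such as `n_steps`, are assumptions or SB3 defaults):

```python
from stable_baselines3 import PPO

# Construct PPO with the hyperparameters listed above;
# `env` is the training environment from the Usage section
model = PPO(
    "MlpPolicy",
    env,
    learning_rate=3e-4,  # 0.0003
    batch_size=256,
    gamma=0.99,
    gae_lambda=0.95,
    clip_range=0.2,
    ent_coef=0.01,
    verbose=1,
)
model.learn(total_timesteps=1_000_000)
```
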
## Files

- `ppo_xauusd.safetensors`: Model weights in SafeTensors format
- `vecnormalize.pkl`: VecNormalize statistics for observation normalization

## License

MIT License

## Disclaimer

This model is for educational and research purposes only. Trading involves risk, and past performance does not guarantee future results. Always backtest and validate before using in live trading.