JonusNattapong committed
Commit a6f1a54 · verified · 1 Parent(s): 610e83c

Add comprehensive README with model details, metrics, and usage instructions

Files changed (1): README.md (+56 -7)

README.md CHANGED
@@ -50,20 +50,69 @@ This repository contains a Reinforcement Learning model trained using Proximal P
 
 ### Loading the Model
 
+Below are two safe ways to load the trained policy, depending on which files you have available.
+
+Option A: load the full Stable-Baselines3 model (.zip)
+
 ```python
-from safetensors.torch import load_file
 from stable_baselines3 import PPO
+from stable_baselines3.common.vec_env import VecNormalize
+import os
+
+# Create or reconstruct an environment similar to the one used for training,
+# e.g. env = make_your_env(...); replace with your own env factory
+env = ...
+
+# If you saved VecNormalize statistics separately, load them and wrap your env first
+if os.path.exists("models/vecnormalize.pkl"):
+    vec = VecNormalize.load("models/vecnormalize.pkl", env)
+    vec.training = False
+    vec.norm_reward = False
+    env = vec
+
+# Load the full model (policy + optimizer state)
+model = PPO.load("models/ppo_xauusd.zip", env=env)
+```
+
+Option B: load weights saved as SafeTensors into a fresh PPO policy
+
+```python
+from safetensors.torch import load_file
 import torch
+from stable_baselines3 import PPO
+from stable_baselines3.common.vec_env import VecNormalize
+import os
+
+# Create or reconstruct the same environment used for training
+env = ...
+
+# If you have VecNormalize statistics, load them and wrap the env
+if os.path.exists("models/vecnormalize.pkl"):
+    vec = VecNormalize.load("models/vecnormalize.pkl", env)
+    vec.training = False
+    vec.norm_reward = False
+    env = vec
+
+# Instantiate a PPO model with the same policy architecture as in training
+model = PPO("MlpPolicy", env)
 
-# Load state dict from safetensors
-state_dict = load_file("ppo_xauusd.safetensors")
-policy = PPO.policy_class(observation_space, action_space) # Define spaces accordingly
-policy.load_state_dict(state_dict)
+# Load the SafeTensors state dict (load_file already returns torch.Tensor values)
+state_dict = load_file("models/ppo_xauusd.safetensors")
 
-# Create model
-model = PPO(policy=policy, env=env) # Or load full model if available
+# Load the weights into the policy
+model.policy.load_state_dict(state_dict)
+
+# Keep the (possibly normalized) env wrapper on the model
+model.set_env(env)
 ```
 
+Notes:
+- Option A is preferred when `ppo_xauusd.zip` is available, since it contains the entire SB3 model.
+- Option B is useful when only the policy weights were exported as SafeTensors. Ensure the policy architecture and the observation/action spaces match the original training setup.
+- Always set `vec.training = False` and `vec.norm_reward = False` when running inference.
+
 ### For Full Inference
 
 To use the model for trading, you'll need to:
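The commit's note that the policy architecture must match the exported weights is enforced by PyTorch itself: `load_state_dict` raises a `RuntimeError` when tensor shapes differ. A minimal, self-contained sketch of that failure mode, using plain PyTorch and hypothetical toy layer sizes (the real policy's shapes will differ):

```python
import torch.nn as nn

def make_policy(hidden: int) -> nn.Sequential:
    # Toy stand-in for a policy network; sizes are hypothetical
    return nn.Sequential(nn.Linear(4, hidden), nn.ReLU(), nn.Linear(hidden, 2))

# Weights as a SafeTensors export would round-trip them: a flat dict of tensors
state_dict = make_policy(hidden=8).state_dict()

# Same architecture: loading succeeds
make_policy(hidden=8).load_state_dict(state_dict)

# Different architecture: strict loading fails with a size-mismatch error
try:
    make_policy(hidden=16).load_state_dict(state_dict)
    mismatch_caught = False
except RuntimeError:
    mismatch_caught = True
print("mismatch detected:", mismatch_caught)  # → mismatch detected: True
```

This is why Option B only works when the fresh `PPO("MlpPolicy", env)` reproduces the original network layout and observation/action spaces.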