Haoxiang-Wang committed on
Commit 1b7ed31 · verified · 1 Parent(s): f8b463e

Update README.md
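
The four changed lines in this commit all rewrite an image link from `../assets/...` to `./assets/...`. The sketch below illustrates why that matters, under the assumption (not stated in the commit) that `README.md` sits at the repository root next to an `assets/` directory. The helper `resolve_link` is hypothetical and only mimics how a relative link is resolved against the file that contains it.

```python
import posixpath


def resolve_link(document_path: str, link: str) -> str:
    """Resolve a relative link against the directory of the document holding it.

    Hypothetical helper for illustration; paths are repo-relative POSIX paths.
    """
    base_dir = posixpath.dirname(document_path)
    return posixpath.normpath(posixpath.join(base_dir, link))


# Assumption: README.md lives at the repository root.
old_target = resolve_link("README.md", "../assets/method_NFT.jpg")
new_target = resolve_link("README.md", "./assets/method_NFT.jpg")

# A result that still starts with ".." points above the repository root.
print(old_target.startswith(".."))
print(new_target)
```

Under that assumption, the old link climbs out of the repository while the new one stays inside it, which is consistent with the direction of the change.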

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -133,7 +133,7 @@ Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated sys

  ## Training Method

- ![NFT Method](../assets/method_NFT.jpg)
+ ![NFT Method](./assets/method_NFT.jpg)

  The NFT training pipeline consists of three main components:

@@ -147,7 +147,7 @@ The NFT training pipeline consists of three main components:
  L_NFT(θ) = r[-log(π_θ⁺(a|q) / π(a|q))] + (1-r)[-log((1 - r_q * (π_θ⁺(a|q) / π(a|q))) / (1-r_q))]
  ```

- ![Policy Distribution](../assets/distribution_NFT.jpg)
+ ![Policy Distribution](./assets/distribution_NFT.jpg)

  ## Training Datasets

@@ -176,7 +176,7 @@ NFT-32B is evaluated on 6 mathematical reasoning benchmarks:

  ## Performance

- ![Performance Comparison](../assets/main_compare_NFT.jpg)
+ ![Performance Comparison](./assets/main_compare_NFT.jpg)

  NFT-32B achieves state-of-the-art performance among supervised learning methods for mathematical reasoning:

@@ -192,7 +192,7 @@ NFT-32B achieves state-of-the-art performance among supervised learning methods

  Notably, NFT-32B performs similarly to DAPO (59.2% vs 59.9%) while using a simpler supervised learning approach.

- ![Validation Curves](../assets/val_acc_curve_NFT.jpg)
+ ![Validation Curves](./assets/val_acc_curve_NFT.jpg)

  ## Usage
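
The `L_NFT(θ)` line that appears as unchanged context in the second hunk can be evaluated numerically for a single sample. The following is a minimal sketch, not the authors' implementation. The diff does not define the symbols, so the readings used here are assumptions: `ratio` stands for `π_θ⁺(a|q) / π(a|q)`, `r` is taken to be a binary correctness label, and `r_q` is taken to be the fraction of correct answers for question `q`, strictly between 0 and 1.

```python
import math


def nft_loss(ratio: float, r: int, r_q: float) -> float:
    """Per-sample value of the L_NFT expression from the README.

    ratio: assumed to be pi_theta_plus(a|q) / pi(a|q), must be > 0
    r:     assumed binary label, 1 for a correct answer, 0 otherwise
    r_q:   assumed per-question correct rate, 0 < r_q < 1
    """
    if r == 1:
        # With r = 1 only the first bracket contributes.
        return -math.log(ratio)
    # With r = 0 only the second bracket contributes.
    inner = (1.0 - r_q * ratio) / (1.0 - r_q)
    if inner <= 0.0:
        raise ValueError("r_q * ratio must stay below 1 for the log to exist")
    return -math.log(inner)


# When the ratio is exactly 1, both brackets evaluate to zero.
print(nft_loss(1.0, r=1, r_q=0.5))
print(nft_loss(1.0, r=0, r_q=0.5))
```

Branching on `r` is equivalent to the weighted sum in the formula when `r` is binary, and it avoids evaluating a logarithm whose term carries zero weight.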