Safetensors
qwen2
fp8
Commit 552f334 (verified) · committed by baicaihaochi121 · 1 Parent(s): 5d1a0c6

Update README.md

Files changed (1)
  1. README.md +10 -3
README.md CHANGED
@@ -10,12 +10,12 @@ license: apache-2.0
    <a href="https://infix-ai.com/research/infir2/">🌐 Project Website</a> &nbsp;
  </p>
 
- We performed **Reinforcement Learning (RL)** on the **InfiR2-7B-Instruct-FP8** model using the **dapo-math-17k** and the **FP8 format**, with hyperparameters shown below.
+ We performed **Reinforcement Learning (RL)** on the **InfiR2-7B-Instruct-FP8** model using the **dapo-math-17k** and the **FP8 format** (inference), with hyperparameters shown below.
 
  <div align="center">
 
 
- | Parameter | Value |
+ | Parameter(stage2) | Value |
  | :---: | :---: |
  | **Batch Size** | 128 |
  | **N Samples Per Prompt** | 16 |
@@ -85,12 +85,19 @@ Below is the performance comparison of **InfiR2-R1-7B-FP8** on reasoning benchma
  <td align="center">39.48</td>
  </tr>
  <tr>
- <td align="left"><strong>InfiR2-R1-7B-FP8</strong></td>
+ <td align="left"><strong>InfiR2-7B-Instruct-FP8</strong></td>
  <td align="center">40.62</td>
  <td align="center">55.73</td>
  <td align="center">45.33</td>
  <td align="center">40.31</td>
  </tr>
+ <tr>
+ <td align="left"><strong>InfiR2-R1-7B-FP8</strong></td>
+ <td align="center"><strong>53.64</strong></td>
+ <td align="center"><strong>60.62</strong></td>
+ <td align="center"><strong>49.18</strong></td>
+ <td align="center">39.36</td>
+ </tr>
  </tr>
  </tbody>
  </table>