Update README.md
README.md
CHANGED
|
@@ -10,12 +10,12 @@ license: apache-2.0
|
|
| 10 |
<a href="https://infix-ai.com/research/infir2/">🌐 Project Website</a>
|
| 11 |
</p>
|
| 12 |
|
| 13 |
-
We performed **Reinforcement Learning (RL)** on the **InfiR2-7B-Instruct-FP8** model using the **dapo-math-17k** and the **FP8 format
|
| 14 |
|
| 15 |
<div align="center">
|
| 16 |
|
| 17 |
|
| 18 |
-
| Parameter | Value |
|
| 19 |
| :---: | :---: |
|
| 20 |
| **Batch Size** | 128 |
|
| 21 |
| **N Samples Per Prompt** | 16 |
|
|
@@ -85,12 +85,19 @@ Below is the performance comparison of **InfiR2-R1-7B-FP8** on reasoning benchma
|
|
| 85 |
<td align="center">39.48</td>
|
| 86 |
</tr>
|
| 87 |
<tr>
|
| 88 |
-
<td align="left"><strong>InfiR2-
|
| 89 |
<td align="center">40.62</td>
|
| 90 |
<td align="center">55.73</td>
|
| 91 |
<td align="center">45.33</td>
|
| 92 |
<td align="center">40.31</td>
|
| 93 |
</tr>
|
| 94 |
</tr>
|
| 95 |
</tbody>
|
| 96 |
</table>
|
|
|
|
| 10 |
<a href="https://infix-ai.com/research/infir2/">🌐 Project Website</a>
|
| 11 |
</p>
|
| 12 |
|
| 13 |
+
We performed **Reinforcement Learning (RL)** on the **InfiR2-7B-Instruct-FP8** model using the **dapo-math-17k** and the **FP8 format** (inference), with hyperparameters shown below.
|
| 14 |
|
| 15 |
<div align="center">
|
| 16 |
|
| 17 |
|
| 18 |
+
| Parameter(stage2) | Value |
|
| 19 |
| :---: | :---: |
|
| 20 |
| **Batch Size** | 128 |
|
| 21 |
| **N Samples Per Prompt** | 16 |
|
|
|
|
| 85 |
<td align="center">39.48</td>
|
| 86 |
</tr>
|
| 87 |
<tr>
|
| 88 |
+
<td align="left"><strong>InfiR2-7B-Instruct-FP8</strong></td>
|
| 89 |
<td align="center">40.62</td>
|
| 90 |
<td align="center">55.73</td>
|
| 91 |
<td align="center">45.33</td>
|
| 92 |
<td align="center">40.31</td>
|
| 93 |
</tr>
|
| 94 |
+
<tr>
|
| 95 |
+
<td align="left"><strong>InfiR2-R1-7B-FP8</strong></td>
|
| 96 |
+
<td align="center"><strong>53.64</strong></td>
|
| 97 |
+
<td align="center"><strong>60.62</strong></td>
|
| 98 |
+
<td align="center"><strong>49.18</strong></td>
|
| 99 |
+
<td align="center">39.36</td>
|
| 100 |
+
</tr>
|
| 101 |
</tr>
|
| 102 |
</tbody>
|
| 103 |
</table>
|