Update README.md
README.md CHANGED
@@ -29,7 +29,16 @@ language:

 # SP3F-7B

-
+SP3F-7B is a multilingual model trained with Self-Play with Privileged Pairwise Feedback, using Qwen2.5-7B as its base model.
+
+| Model | Overall Acc | Overall Lang | MGSM Acc | MGSM Lang | MT Math100 Acc | MT Math100 Lang | Belebele Acc | Belebele Lang | Global MMLU Lite Acc | Global MMLU Lite Lang |
+|:------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+| Qwen2.5-7B | 14.79 | 78.78 | 22.15 | 90.67 | 21.16 | 58.22 | 7.52 | 80.39 | 8.34 | 85.85 |
+| + SFT | 21.70 | 82.11 | 33.66 | 91.37 | 26.72 | 58.26 | 12.94 | 89.18 | 13.48 | 89.62 |
+| + RLVR | <u>57.79</u> | **96.09** | 65.34 | **99.75** | 44.50 | **86.10** | **68.18** | <u>98.73</u> | <u>53.15</u> | **99.78** |
+| **SP3F-7B** | **61.91** | <u>95.35</u> | **72.50** | <u>99.38</u> | <u>56.84</u> | <u>82.93</u> | <u>67.54</u> | **99.65** | 50.76 | <u>99.45</u> |
+| Qwen2.5-7B-Instruct | 55.87 | 89.21 | <u>66.36</u> | 98.38 | 52.12 | 65.66 | 56.79 | 96.59 | 48.20 | 96.21 |
+| + Translate Test | 57.01 | 85.98 | 66.15 | 95.81 | **60.08** | 59.34 | 48.09 | 92.27 | **53.73** | 96.49 |

 ### Citation

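The README does not say how the two Overall columns in the results table are computed. The figures are consistent with an unweighted mean over the four benchmarks (MGSM, MT Math100, Belebele, Global MMLU Lite), and the sketch below checks that reading against the reported numbers. The dictionary layout and the `max_deviation` helper are illustrative names, not part of the repository, and the averaging rule itself is an assumption inferred from the table.

```python
# Values copied from the results table. Each tuple is
# (overall, mgsm, mt_math100, belebele, global_mmlu_lite).
ACC = {
    "Qwen2.5-7B": (14.79, 22.15, 21.16, 7.52, 8.34),
    "+ SFT": (21.70, 33.66, 26.72, 12.94, 13.48),
    "+ RLVR": (57.79, 65.34, 44.50, 68.18, 53.15),
    "SP3F-7B": (61.91, 72.50, 56.84, 67.54, 50.76),
    "Qwen2.5-7B-Instruct": (55.87, 66.36, 52.12, 56.79, 48.20),
    "+ Translate Test": (57.01, 66.15, 60.08, 48.09, 53.73),
}
LANG = {
    "Qwen2.5-7B": (78.78, 90.67, 58.22, 80.39, 85.85),
    "+ SFT": (82.11, 91.37, 58.26, 89.18, 89.62),
    "+ RLVR": (96.09, 99.75, 86.10, 98.73, 99.78),
    "SP3F-7B": (95.35, 99.38, 82.93, 99.65, 99.45),
    "Qwen2.5-7B-Instruct": (89.21, 98.38, 65.66, 96.59, 96.21),
    "+ Translate Test": (85.98, 95.81, 59.34, 92.27, 96.49),
}


def max_deviation(rows):
    """Largest gap between a reported overall score and the mean of its parts.

    Hypothetical helper: assumes 'overall' is the unweighted benchmark mean.
    """
    worst = 0.0
    for overall, *parts in rows.values():
        mean = sum(parts) / len(parts)
        worst = max(worst, abs(mean - overall))
    return worst


if __name__ == "__main__":
    # Gaps below 0.01 are explained by rounding to two decimals.
    print(f"accuracy: max deviation {max_deviation(ACC):.4f}")
    print(f"language: max deviation {max_deviation(LANG):.4f}")
```

If either deviation came out well above the two-decimal rounding error, the unweighted-mean reading would be wrong and the Overall columns would need an explicit definition in the README.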