Text Generation
Safetensors
qwen2
conversational
Commit 46c9a61 by lintang · verified · 1 parent: e34e4d5

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -29,7 +29,16 @@ language:
 
  # SP3F-7B
 
-
+ SP3F-7B is a multilingual model trained with Self-Play with Privileged Pairwise Feedback, we use Qwen2.5-7B as our base.
+
+ | Model | Overall Acc | Overall Lang | MGSM Acc | MGSM Lang | MT Math100 Acc | MT Math100 Lang | Belebele Acc | Belebele Lang | Global MMLU Lite Acc | Global MMLU Lite Lang |
+ |:------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+ | Qwen2.5-7B | 14.79 | 78.78 | 22.15 | 90.67 | 21.16 | 58.22 | 7.52 | 80.39 | 8.34 | 85.85 |
+ |     + SFT | 21.70 | 82.11 | 33.66 | 91.37 | 26.72 | 58.26 | 12.94 | 89.18 | 13.48 | 89.62 |
+ | &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;+ RLVR | <u>57.79</u> | **96.09** | 65.34 | **99.75** | 44.50 | **86.10** | **68.18** | <u>98.73</u> | <u>53.15</u> | **99.78** |
+ | **SP3F-7B** | **61.91** | <u>95.35</u> | **72.50** | <u>99.38</u> | <u>56.84</u> | <u>82.93</u> | <u>67.54</u> | **99.65** | 50.76 | <u>99.45</u> |
+ | Qwen2.5-7B-Instruct | 55.87 | 89.21 | <u>66.36</u> | 98.38 | 52.12 | 65.66 | 56.79 | 96.59 | 48.20 | 96.21 |
+ | &nbsp;&nbsp;&nbsp;&nbsp;+ Translate Test | 57.01 | 85.98 | 66.15 | 95.81 | **60.08** | 59.34 | 48.09 | 92.27 | **53.73** | 96.49 |
 
  ### Citation
 
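Review note: the README does not say how the "Overall Acc" column in the added table is computed. The numbers suggest it is the unweighted mean of the four per-benchmark accuracy columns (MGSM, MT Math100, Belebele, Global MMLU Lite). That reading is an inference from the table, not something the commit states. The sketch below checks it against the reported values; the names `ROWS` and `overall_mismatches` are illustrative only.

```python
# Accuracy values copied from the table added in this commit.
# Each entry: reported Overall Acc, then per-benchmark Acc in the order
# MGSM, MT Math100, Belebele, Global MMLU Lite.
ROWS = {
    "Qwen2.5-7B": (14.79, [22.15, 21.16, 7.52, 8.34]),
    "+ SFT": (21.70, [33.66, 26.72, 12.94, 13.48]),
    "+ RLVR": (57.79, [65.34, 44.50, 68.18, 53.15]),
    "SP3F-7B": (61.91, [72.50, 56.84, 67.54, 50.76]),
    "Qwen2.5-7B-Instruct": (55.87, [66.36, 52.12, 56.79, 48.20]),
    "+ Translate Test": (57.01, [66.15, 60.08, 48.09, 53.73]),
}


def overall_mismatches(rows, tolerance=0.005):
    """Return the row names whose reported overall score differs from the
    unweighted mean of the per-benchmark scores by more than `tolerance`
    (half a unit in the last reported decimal place)."""
    mismatched = []
    for name, (reported, scores) in rows.items():
        mean = sum(scores) / len(scores)
        if abs(reported - mean) > tolerance:
            mismatched.append(name)
    return mismatched


mismatched = overall_mismatches(ROWS)
print(mismatched or "Overall Acc is consistent with an unweighted mean")
```

If the assumption holds for every row, the list is empty. The same check can be run on the "Lang" columns by substituting those values.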