Update README.md
README.md CHANGED

@@ -83,7 +83,7 @@ RRMs have been extensively evaluated on several benchmarks.

  | Model                                 | MMLU-Pro | MATH | GPQA | Overall |
  |---------------------------------------|----------|------|------|---------|
- | Skywork-Reward-Gemma-2-27B
+ | Skywork-Reward-Gemma-2-27B            | 55.0     | 46.2 | 44.7 | 48.6    |
  | J1-Llama-8B (SC@32)                   | 67.5     | 76.6 | 55.7 | 66.7    |
  | J1-Llama-70B (SC@32)                  | 79.9     | 88.1 | 66.5 | 78.2    |
  | DeepSeek-GRM-27B (MetaRM) (voting@32) | 68.1     | 70.0 | 56.9 | 65.0    |