Upload folder using huggingface_hub
README.md
CHANGED
@@ -59,7 +59,7 @@ We evaluate our model on two challenging reward benchmarks, [RM-Bench](https://g
 
 - Results on the JudgeBench.
 
-| **Model** | **Params.** | **
+| **Model** | **Params.** | **Knowl.** | **Reason.** | **Math** | **Coding** | **Overall** |
 |:-|-:|:-:|:-:|:-:|:-:|:-:|
 |**LLM-as-a-Judge**||||||
 |GPT-4o |- |50.6 | 54.1 | 75.0 | 59.5 | 59.8 |