Update README.md
README.md (CHANGED)
@@ -129,6 +129,7 @@ Note:
 | MT-Eval | 8.13 | 7.36 | 6.7 | 8.18 | 8.45 | 8.12 | - |
 | AlignBench v1.1 | 7 | 6.13 | 5.99 | 6.95 | 6.3 | 6.33 | 7.06 |
 | Average | 53.74 | - | 46.54 | 52.61 | 50.54 | 48.95 | - |
+
 Note:
 1. For InternLM3-8B-Instruct, the results marked with `*` are sourced from their official website; the other evaluations are conducted with internal evaluation frameworks.
 2. For Multi-IF, we report the overall average computed across all three rounds, pooling the Chinese and English metrics.
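
As a point of reference for note 2, here is a minimal sketch of that aggregation, assuming per-round Multi-IF scores are available separately for Chinese and English. The dictionary layout, variable names, and numbers below are illustrative placeholders, not values or code from the repository:

```python
# Hypothetical per-round Multi-IF accuracies, keyed by language.
# Values are placeholders, not results from the README table.
multi_if_scores = {
    "en": [0.82, 0.74, 0.69],  # English, rounds 1-3
    "zh": [0.79, 0.71, 0.65],  # Chinese, rounds 1-3
}

# Pool the Chinese and English metrics, then average across all three
# rounds: the reported figure is the mean over every (language, round) cell.
pooled = [score for rounds in multi_if_scores.values() for score in rounds]
overall_average = sum(pooled) / len(pooled)
print(f"Multi-IF overall average: {overall_average:.2%}")
```

Under this reading, each (language, round) cell carries equal weight, rather than averaging per-language means first; if the two languages had different round counts, the two orderings would disagree.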