update a bit one
app.py CHANGED
@@ -2211,8 +2211,16 @@ with block:
 - **Number of Models**: {NUM_MODELS}
 - **Mode of Evaluation**: Zero-Shot, Five-Shot
 
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+Known Issues:
+- For base models, the output is not truncated, as no EOS is detected. Evaluation could be affected, especially with length-aware metrics.
+
+- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
 The following table shows the performance of the models on the SeaEval benchmark.
+- For **Zero-shot** performance, it is the median value from 5 distinct prompts shown on the above leaderboard, to mitigate the influence of random variations induced by prompts.
+- A (-1) value indicates the results are not ready yet.
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+
 """)
 
 
@@ -3221,13 +3229,7 @@ with block:
 
 
 gr.Markdown(r"""
-
-- For **Zero-shot** performance, it is the median value from 5 distinct prompts shown on the above leaderboard to mitigate the influence of random variations induced by prompts.
-
-- (-1) value indicates the results are ready yet and will be updated soon.
-
-
-
+
 If our datasets and leaderboard are useful, please consider citing:
 ```bibtex
 @article{SeaEval,
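The aggregation rule this change documents (zero-shot score = median over 5 distinct prompts, with (-1) marking results that are not ready yet) can be sketched as follows. This is a minimal illustration, not code from app.py; `aggregate_zero_shot`, `scores_by_prompt`, and `PENDING` are hypothetical names.

```python
from statistics import median

PENDING = -1  # sentinel shown on the leaderboard for results not ready yet

def aggregate_zero_shot(scores_by_prompt):
    """Median of the per-prompt scores, mitigating prompt-induced variance.

    `scores_by_prompt` maps a prompt id (1..5) to the model's score under
    that prompt. If any prompt's result is still pending, the whole cell
    stays pending.
    """
    scores = list(scores_by_prompt.values())
    if not scores or PENDING in scores:
        return PENDING
    return median(scores)

# Five distinct prompts, one score each: the median damps outlier prompts.
print(aggregate_zero_shot({1: 61.2, 2: 58.9, 3: 60.4, 4: 57.1, 5: 62.0}))  # 60.4
print(aggregate_zero_shot({1: 61.2, 2: -1, 3: 60.4, 4: 57.1, 5: 62.0}))    # -1
```

Using the median rather than the mean keeps a single adversarial or unlucky prompt from dragging the headline number, which is the stated motivation in the note above.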