Spaces: MERaLiON/SeaEval_Leaderboard
binwang committed on Apr 12, 2024

Commit e90e78a (1 parent: 2de9386)

new

Files changed (1)

app.py CHANGED

@@ -2209,9 +2209,6 @@ with block:
     - **Number of Models**: {NUM_MODELS}
     - **Mode of Evaluation**: Zero-Shot, Five-Shot
-    ### Possible Issues:
-    - For base models, the output of base model is not truncated as no EOS detected. Evaluation could be affected, especially with length-aware metrics.
     ### The following table shows the performance of the models on the SeaEval benchmark.
     - For **Zero-shot** performance, it is the median value from 5 distinct prompts shown on the above leaderboard to mitigate the influence of random variations induced by prompts.
     - (-1) value indicates the results are not ready yet.
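
The zero-shot aggregation described in the leaderboard text above can be sketched as follows. This is a minimal illustration, not code from `app.py`: the function name `zero_shot_score`, the `NOT_READY` constant, and the choice to report -1 whenever any of the per-prompt scores is missing are all assumptions made for the example.

```python
from statistics import median

# Sentinel the leaderboard text uses for results that are not available yet.
NOT_READY = -1


def zero_shot_score(prompt_scores):
    """Return the median of per-prompt scores (hypothetical helper).

    Assumption: if the list is empty or any prompt's score is still the
    -1 sentinel, the aggregate is reported as -1 as well.
    """
    if not prompt_scores or any(s == NOT_READY for s in prompt_scores):
        return NOT_READY
    return median(prompt_scores)


# Five scores, one per distinct prompt (made-up numbers for illustration).
print(zero_shot_score([61.2, 58.9, 63.4, 60.1, 59.7]))  # 60.1
```

Taking the median rather than the mean keeps a single unusually good or bad prompt from shifting the reported score, which matches the stated goal of reducing prompt-induced variation.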
