Commit e987f7e · Parent: 0ad4395
Update README.md

README.md CHANGED
@@ -22,14 +22,17 @@ We evaluated model_009 on a wide range of tasks using [Language Model Evaluation
 
 Here are the results on metrics used by [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
-
-
-|**Task**|**
-|*
-|*
-|*
-|*
-|
+|||
+|:------:|:-------:|
+|**Task**|**Value**|
+|*ARC*|0.7159|
+|*HellaSwag*|0.8771|
+|*MMLU*|0.6943|
+|*TruthfulQA*|0.6072|
+|*Winogrande*|0.8232|
+|*GSM8k*|0.3942|
+|*DROP*|0.4401|
+|**Total Average**|**0.6503**|
 
 
 ## Example Usage
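
A quick arithmetic check on the added table: the **Total Average** row is consistent with an unweighted mean of the seven task scores. The sketch below assumes that definition (the README does not state how the average is computed); the `scores` dict is only the values copied from the added rows.

```python
# Values copied from the rows added in this commit.
scores = {
    "ARC": 0.7159,
    "HellaSwag": 0.8771,
    "MMLU": 0.6943,
    "TruthfulQA": 0.6072,
    "Winogrande": 0.8232,
    "GSM8k": 0.3942,
    "DROP": 0.4401,
}

# Assumption: "Total Average" is the plain (unweighted) mean of all tasks.
average = sum(scores.values()) / len(scores)

print(f"{average:.4f}")  # prints 0.6503
```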