Adding Evaluation Results (#9)
opened by pansophic

README.md (changed)
```diff
@@ -1,11 +1,11 @@
 ---
-model-index:
-- name: rocket-3b
-  results: []
-license: cc-by-sa-4.0
 language:
 - en
+license: cc-by-sa-4.0
 base_model: stabilityai/stablelm-3b-4e1t
+model-index:
+- name: rocket-3b
+  results: []
 ---
 
 <img src="https://cdn-uploads.huggingface.co/production/uploads/6501bfe0493fd9c8c2e32402/BmbkjOkcTm-YMa-unolmJ.png" alt="Rocket Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
@@ -128,4 +128,17 @@ The pretraining dataset is comprised of a filtered mixture of open-source large-
 
 **The model name is inspired by the small but formidable character from 'Guardians of the Galaxy'. Similar to its namesake, this model, with its 3 billion parameters, showcases remarkable efficiency and effectiveness, challenging larger models despite its smaller size."*
 
-*Model card adapted from [Zephyr Beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta/blob/main/README.md) and [Tulu-2-7B](https://huggingface.co/allenai/tulu-2-7b/blob/main/README.md)*
+*Model card adapted from [Zephyr Beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta/blob/main/README.md) and [Tulu-2-7B](https://huggingface.co/allenai/tulu-2-7b/blob/main/README.md)*
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_pansophic__rocket-3B)
+
+| Metric                          |Value|
+|---------------------------------|----:|
+|Avg.                             |55.77|
+|AI2 Reasoning Challenge (25-Shot)|50.60|
+|HellaSwag (10-Shot)              |76.69|
+|MMLU (5-Shot)                    |47.10|
+|TruthfulQA (0-shot)              |55.82|
+|Winogrande (5-shot)              |67.96|
+|GSM8k (5-shot)                   |36.47|
+
```