Update README.md

README.md (CHANGED):

@@ -22,7 +22,7 @@ base_model: stabilityai/stablelm-3b-4e1t
 
 
 ## Performance
-Despite its compact dimensions, the model achieves outstanding scores in both
+Despite its compact dimensions, the model achieves outstanding scores in both [MT-Bench](https://huggingface.co/spaces/lmsys/mt-bench) and [AlpacaEval](https://tatsu-lab.github.io/alpaca_eval/) benchmarks, surpassing the performance of considerably larger models.
 
 | Model | Size | Alignment | MT-Bench (score) | AlpacaEval (win rate %) |
 |-------------|-----|----|---------------|--------------|
@@ -63,18 +63,17 @@ In AlpacaEval, Rocket 🦝 achieves a near 80% win rate, coupled with an average
 | **Rocket** 🦝 | **79.75** | **1.42** | **1242** |
 
 
-##
+## Open LLM leaderboard
 
 | Metric                | Value                     |
 |-----------------------|---------------------------|
-| Average |
-| ARC
-| HellaSwag
-| MMLU
-| TruthfulQA
-| Winogrande
-| GSM8K
-| DROP (3-shot) | 24.49 |
+| Average               | 55.77                     |
+| ARC                   | 50.6                      |
+| HellaSwag             | 76.69                     |
+| MMLU                  | 47.1                      |
+| TruthfulQA            | 55.82                     |
+| Winogrande            | 67.96                     |
+| GSM8K                 | 36.47                     |
 
 
 ## Intended uses & limitations
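The new Open LLM leaderboard table's Average row is consistent with the arithmetic mean of the six per-task scores it lists; a minimal sketch of that check (score values taken from the table above):

```python
# Per-task Open LLM Leaderboard scores from the updated table.
scores = {
    "ARC": 50.6,
    "HellaSwag": 76.69,
    "MMLU": 47.1,
    "TruthfulQA": 55.82,
    "Winogrande": 67.96,
    "GSM8K": 36.47,
}

# The "Average" row is the plain mean of the six task scores,
# rounded to two decimal places.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # → 55.77
```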