LLM360/CrystalChat

victormiller committed on Jun 17, 2024

Commit fd74aa8 · verified · 1 Parent(s): b5333d3

Update README.md

Files changed (1)

README.md +7 -0

@@ -201,6 +201,13 @@ As always, the training data, training code, and metrics are publicly available.
 | Llama-2-7b-Chat          | 2T             | 34.11        | 52.86         | 15.35       | 53.07 | 78.39     | 48.42         | 18.88 | 73.09              | 45.30      | 13.26              | 17.43         |
 | AmberChat 7B             | 1.25T          |     -        | 44.76         |     -       | 42.83 | 74.03     | 38.88         | 5.31  | 66.77              | 40.72      |     -              |       -       |
+|           Model          | Trained Tokens |  ARC  | HellaSwag | MMLU (5-shot) | GSM8K | Winogrande(5-shot) | TruthfulQA | HumanEval (pass@1) | MBPP (pass@1) |
+|:------------------------:|:--------------:|:------------:|:-------------:|:-----------:|:-----:|:---------:|:-------------:|:-----:|:------------------:|
+| CrystalChat 7B           | 1.275T         | 51.71 | 76.12     | 53.22         | 28.05 | 70.64              | 47.29      | 34.12              | 39.11         |
+| Mistral-7B-Instruct-v0.1 | -              | 58.05 | 75.71     | 55.56         | 32.00 | 74.27              | 55.90      | 29.27              | 31.96         |
+| CodeLlama-7b-Instruct    | 2.5T           | 43.35 | 66.14     | 42.75         | 15.92 | 64.33              | 39.23      | 34.12              | 38.91         |
+| Llama-2-7b-Chat          | 2T             | 53.07 | 78.39     | 48.42         | 18.88 | 73.09              | 45.30      | 13.26              | 17.43         |
+| AmberChat 7B             | 1.25T          | 42.83 | 74.03     | 38.88         | 5.31  | 66.77              | 40.72      |     -              |       -       |
 | Combined Language and Coding Ability           |