---
language:
- en
license: cc-by-nc-4.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
base_model: alnrg2arg/blockchainlabs_7B_merged_test2_4
datasets:
- Open-Orca/SlimOrca
---

Benchmark Scores

|    Tasks    |Version|Filter|n-shot| Metric |Value |   |Stderr|
|-------------|------:|------|-----:|--------|-----:|---|-----:|
|arc_challenge|      1|none  |     0|acc     |0.5247|±  |0.0146|
|             |       |none  |     0|acc_norm|0.5623|±  |0.0145|

|  Tasks  |Version|Filter|n-shot| Metric |Value |   |Stderr|
|---------|------:|------|-----:|--------|-----:|---|-----:|
|hellaswag|      1|none  |     0|acc     |0.6270|±  |0.0048|
|         |       |none  |     0|acc_norm|0.8228|±  |0.0038|

|      Groups      |Version|Filter|n-shot|Metric|Value |   |Stderr|
|------------------|-------|------|-----:|------|-----:|---|-----:|
|mmlu              |N/A    |none  |     0|acc   |0.6243|±  |0.1341|
| - humanities     |N/A    |none  |     0|acc   |0.5717|±  |0.1400|
| - other          |N/A    |none  |     0|acc   |0.7016|±  |0.1143|
| - social_sciences|N/A    |none  |     0|acc   |0.7342|±  |0.0753|
| - stem           |N/A    |none  |     0|acc   |0.5192|±  |0.1257|

|  Tasks   |Version|Filter|n-shot|Metric|Value |   |Stderr|
|----------|------:|------|-----:|------|-----:|---|-----:|
|winogrande|      1|none  |     0|acc   |0.7774|±  |0.0117|

|Tasks|Version|  Filter  |n-shot|  Metric   |Value |   |Stderr|
|-----|------:|----------|-----:|-----------|-----:|---|-----:|
|gsm8k|      2|get-answer|     5|exact_match|0.6732|±  |0.0129|

|    Tasks     |Version|Filter|n-shot|Metric|Value |   |Stderr|
|--------------|------:|------|-----:|------|-----:|---|-----:|
|truthfulqa_mc2|      2|none  |     0|acc   |0.4795|±  |0.0148|

Average: 65.658
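
The card does not say how the average is computed. The reported figure appears to be the unweighted mean of one headline metric per benchmark, expressed as a percentage. The sketch below reproduces it under that assumption: `acc_norm` is taken for arc_challenge and hellaswag, and `acc` or `exact_match` for the rest. The choice of metrics and the helper name `average_percent` are assumptions made for illustration, not something stated in the card.

```python
# Headline score per benchmark, copied from the tables above (fractions in [0, 1]).
# Assumption: acc_norm for arc_challenge and hellaswag, acc / exact_match elsewhere.
scores = {
    "arc_challenge (acc_norm)": 0.5623,
    "hellaswag (acc_norm)": 0.8228,
    "mmlu (acc)": 0.6243,
    "winogrande (acc)": 0.7774,
    "gsm8k (exact_match)": 0.6732,
    "truthfulqa_mc2 (acc)": 0.4795,
}


def average_percent(values):
    """Return the unweighted mean of fractional scores as a percentage."""
    values = list(values)
    return 100.0 * sum(values) / len(values)


average = average_percent(scores.values())
print(f"Average: {average:.3f}")  # prints "Average: 65.658"
```

Substituting the plain `acc` values for arc_challenge and hellaswag gives a lower mean, so the match with 65.658 suggests the normalized metrics were the ones used.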