---
license: apache-2.0
---
this is the official model used in dot. as of 09/2024, it consistently outperforms every sota model below 700m parameters, outperforms most 1b models, and is competitive with the best of them.
# benchmarks
zero-shot evaluations of current sota ~0.5b models, alongside larger baselines at 1.1b and 6.7b parameters.
| Parameters | Model | MMLU | ARC-C | HellaSwag | PIQA | Winogrande | Average |
|---|---|---|---|---|---|---|---|
| 0.5b | qwen2 | 0.4413 | 0.2892 | 0.4905 | 0.6931 | 0.5699 | 0.4968 |
| 1.1b | palmer | 0.2661 | 0.3490 | 0.6173 | 0.7481 | 0.6417 | 0.5244 |
| 0.5b | arco | 0.2617 | 0.3729 | 0.6288 | 0.7437 | 0.6227 | 0.5260 |
| 6.7b | gptj | 0.2730 | 0.3660 | 0.6630 | 0.7620 | 0.6401 | 0.5408 |
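The Average column is the unweighted mean of the five benchmark scores, rounded to four decimals. A quick sketch to verify the table's arithmetic (model names and values copied from the table above):

```python
# each Average cell should be the unweighted mean of the five
# benchmark scores (MMLU, ARC-C, HellaSwag, PIQA, Winogrande)
scores = {
    "qwen2":  [0.4413, 0.2892, 0.4905, 0.6931, 0.5699],
    "palmer": [0.2661, 0.3490, 0.6173, 0.7481, 0.6417],
    "arco":   [0.2617, 0.3729, 0.6288, 0.7437, 0.6227],
    "gptj":   [0.2730, 0.3660, 0.6630, 0.7620, 0.6401],
}
reported = {"qwen2": 0.4968, "palmer": 0.5244, "arco": 0.5260, "gptj": 0.5408}

for model, vals in scores.items():
    mean = round(sum(vals) / len(vals), 4)
    # confirm the computed mean matches the Average column
    assert mean == reported[model], (model, mean, reported[model])
```

All four rows check out, so the reported averages are internally consistent with the per-benchmark scores.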

