---
license: apache-2.0
---
cubby consistently outperforms every sota model below 600M parameters, outperforms base 1B models, and is competitive with the best of them. cubby is a merge of multiple internal models fine-tuned on a diverse set of styles; these were merged together and then merged with the base model to preserve its knowledge.
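The exact merge recipe isn't published. As an illustration only, a merge of this shape could be expressed as a mergekit-style YAML config; every model name below is a placeholder, and the merge method is an assumption, not the actual recipe:

```yaml
# hypothetical mergekit config — illustrative sketch, not cubby's actual recipe
models:
  - model: internal/style-ft-a   # placeholder: fine-tune on one style
  - model: internal/style-ft-b   # placeholder: fine-tune on another style
merge_method: model_stock        # assumed method; the real one is unspecified
base_model: internal/base-0.5b   # merging back with the base preserves knowledge
dtype: float16
```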
# benchmarks
Zero-shot evaluations performed on current sota ~0.5B models and palmer-004.
| Parameters | Model | MMLU | ARC-C | HellaSwag | PIQA | Winogrande | Average |
|---|---|---|---|---|---|---|---|
| 0.5b | qwen2 | 0.4413 | 0.2892 | 0.4905 | 0.6931 | 0.5699 | 0.4968 |
| 0.5b | palmer-004-turbo | 0.2736 | 0.3558 | 0.6179 | 0.7367 | 0.6117 | 0.5191 |
| 1.1b | palmer-004 | 0.2661 | 0.3490 | 0.6173 | 0.7481 | 0.6417 | 0.5244 |
| 0.5b | arco | 0.2617 | 0.3729 | 0.6288 | 0.7437 | 0.6227 | 0.5260 |
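The Average column is the unweighted mean of the five task accuracies, rounded to four decimals. A quick sanity check of the table:

```python
# recompute each row's average from the per-task scores in the table above
rows = {
    "qwen2":            [0.4413, 0.2892, 0.4905, 0.6931, 0.5699],
    "palmer-004-turbo": [0.2736, 0.3558, 0.6179, 0.7367, 0.6117],
    "palmer-004":       [0.2661, 0.3490, 0.6173, 0.7481, 0.6417],
    "arco":             [0.2617, 0.3729, 0.6288, 0.7437, 0.6227],
}
reported = {"qwen2": 0.4968, "palmer-004-turbo": 0.5191,
            "palmer-004": 0.5244, "arco": 0.5260}

for name, scores in rows.items():
    avg = round(sum(scores) / len(scores), 4)
    # allow for round-to-even differences in the last decimal place
    assert abs(avg - reported[name]) <= 0.0001, (name, avg, reported[name])
```

All four reported averages match the plain mean of their rows.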

