Paper: Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch (arXiv:2311.03099)
This is a quantized version of rityak/MN-RocinanteCelestar-12B, created using llama.cpp.
This is a merge of pre-trained language models created using mergekit.
This model was merged with the DARE TIES merge method, using nothingiisreal/MN-12B-Celeste-V1.9 as the base.
The following models were included in the merge:
- nothingiisreal/MN-12B-Starcannon-v3
- TheDrummer/Rocinante-12B-v1
The following YAML configuration was used to produce this model:
models:
  - model: nothingiisreal/MN-12B-Celeste-V1.9
  - model: nothingiisreal/MN-12B-Starcannon-v3
    parameters:
      density: 0.50
      weight: 0.50
  - model: TheDrummer/Rocinante-12B-v1
    parameters:
      density: 0.50
      weight: 0.50
merge_method: dare_ties
base_model: nothingiisreal/MN-12B-Celeste-V1.9
parameters:
  normalize: false
  int8_mask: true
dtype: float32
out_dtype: bfloat16
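
To make the roles of `density`, `weight`, and `normalize` in the configuration above more concrete, here is a minimal NumPy sketch of a DARE TIES style merge on a single tensor. It is an illustration under simplifying assumptions, not mergekit's actual implementation: the function name `dare_ties_merge` and its signature are hypothetical, and details such as `int8_mask` and dtype handling are ignored.

```python
import numpy as np


def dare_ties_merge(base, finetuned, densities, weights, rng, normalize=False):
    """Toy DARE TIES merge of several fine-tuned tensors onto one base tensor.

    Hypothetical helper for illustration only; not part of mergekit.
    """
    weighted = []
    for tensor, density, weight in zip(finetuned, densities, weights):
        delta = tensor - base                          # task vector relative to the base
        keep = rng.random(delta.shape) < density       # DARE: keep each entry with prob. `density`
        delta = np.where(keep, delta / density, 0.0)   # rescale survivors by 1 / density
        weighted.append(weight * delta)
    weighted = np.stack(weighted)

    # TIES-style sign election: the sign of the summed weighted deltas wins.
    elected = np.sign(weighted.sum(axis=0))
    agree = np.sign(weighted) == elected
    merged_delta = np.where(agree, weighted, 0.0).sum(axis=0)

    if normalize:
        # Divide by the total weight of the models that agreed at each position.
        w = np.asarray(weights, dtype=float).reshape(-1, *([1] * base.ndim))
        divisor = (w * agree).sum(axis=0)
        merged_delta = merged_delta / np.where(divisor == 0, 1.0, divisor)

    return base + merged_delta
```

With the values from the configuration, each of the two non-base models would be passed with a density of 0.5 and a weight of 0.5, and `normalize` would stay `False`, so the agreeing weighted deltas are simply summed onto the base tensor.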
Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit