Paper: Resolving Interference When Merging Models (arXiv:2306.01708)
This is a merge of pre-trained language models created using mergekit.
This model was merged with the TIES merge method, using google/gemma-2-2b as the base model.
The following models were included in the merge:
• google/gemma-2-2b-it
The following YAML configuration was used to produce this model:
base_model: google/gemma-2-2b
dtype: bfloat16
merge_method: ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 26]
    model: google/gemma-2-2b
  - layer_range: [0, 26]
    model: google/gemma-2-2b-it
    parameters:
      density:
      - filter: self_attn.o_proj.9
        value: 1.0
      - value: 0.001
      weight:
      - value: 1.0
tokenizer_source: union
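
To illustrate what the density and weight parameters in the configuration above control, here is a minimal pure-Python sketch of the three TIES steps (trim, elect sign, merge agreeing entries) on flat lists of floats. It is a toy under stated assumptions, not mergekit's implementation: the function names `trim` and `ties_merge` are hypothetical, tensors are modeled as plain lists, and options such as int8_mask and per-tensor filters are not modeled. With a single fine-tuned model, as in this merge, the sign election is trivial and the method reduces to trimming the delta against the base.

```python
def trim(delta, density):
    """Keep the largest-magnitude `density` fraction of entries; zero the rest."""
    k = int(len(delta) * density)  # number of entries that survive trimming
    order = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(order[:k])
    return [d if i in keep else 0.0 for i, d in enumerate(delta)]


def ties_merge(base, finetuned, densities, weights, normalize=True):
    """Toy TIES merge of several fine-tuned parameter lists onto one base list."""
    # Step 0: task vectors are the differences between each model and the base.
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    # Step 1: trim each task vector to its configured density.
    trimmed = [trim(d, rho) for d, rho in zip(deltas, densities)]

    merged = []
    for j, b in enumerate(base):
        contribs = [w * t[j] for t, w in zip(trimmed, weights)]
        # Step 2: elect the sign carrying the larger total weighted mass.
        sign = 1.0 if sum(contribs) >= 0 else -1.0
        # Step 3: combine only the contributions that agree with that sign.
        agree = [(c, w) for c, w in zip(contribs, weights) if c * sign > 0]
        value = sum(c for c, _ in agree)
        if normalize and agree:
            # Assumption: normalize divides by the total weight of agreeing models.
            value /= sum(w for _, w in agree)
        merged.append(b + value)
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    model_a = [1.0, -0.2, 0.5, 0.1]
    model_b = [-0.4, 0.3, 0.7, 0.05]
    print(ties_merge(base, [model_a, model_b], [0.5, 0.5], [1.0, 1.0]))
```

In this toy run the two models disagree in sign at index 0; the larger positive delta wins and the negative one is dropped, while the small entries at indices 1 and 3 are removed by trimming before any voting happens. A density of 0.001, as in the configuration, would keep only about one in a thousand delta entries per tensor.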