Resolving Interference When Merging Models (arXiv:2306.01708)
This is a merge of pre-trained language models created using mergekit.
This model was merged using the TIES merge method, with OdiaGenAI/mistral_hindi_7b_base_v1 as the base model.
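As background, TIES merging works on "task vectors" (the difference between each fine-tuned model and the base) in three steps: trim each task vector to its highest-magnitude entries, elect a per-parameter sign by majority, and average only the entries that agree with the elected sign. The following is a minimal numpy sketch of that procedure on flat parameter vectors; the function name and exact tie-breaking details are illustrative, not mergekit's actual implementation.

```python
import numpy as np

def ties_merge(base, finetuned, densities, weights):
    """Illustrative TIES (Trim, Elect Sign, Merge) sketch on 1-D arrays.

    base:      base-model parameters.
    finetuned: list of fine-tuned parameter vectors (same shape as base).
    densities: fraction of task-vector entries kept per model (trim step).
    weights:   merge weight per model.
    """
    trimmed = []
    for ft, density, w in zip(finetuned, densities, weights):
        tau = ft - base                      # task vector
        k = int(round(density * tau.size))   # keep top-k entries by magnitude
        keep = np.zeros(tau.shape, dtype=bool)
        if k > 0:
            idx = np.argsort(np.abs(tau))[-k:]
            keep[idx] = True
        trimmed.append(w * np.where(keep, tau, 0.0))

    stacked = np.stack(trimmed)
    # Elect sign: majority sign by summed (weighted) magnitude per parameter.
    elected = np.sign(stacked.sum(axis=0))
    # Merge: average only the entries whose sign agrees with the elected sign.
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    total = np.where(agree, stacked, 0.0).sum(axis=0)
    counts = np.maximum(agree.sum(axis=0), 1)
    return base + total / counts
```

With a single fine-tuned model, full density, and weight 1.0, this reduces to adding the whole task vector back onto the base, which is a useful sanity check.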
The following models were included in the merge:

* BioMistral/BioMistral-7B
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: OdiaGenAI/mistral_hindi_7b_base_v1
    # no parameters necessary for base model
  - model: BioMistral/BioMistral-7B
    parameters:
      density: 0.7
      weight: [0.1, 0.3, 0.6, 0.7] # weight gradient
merge_method: ties
base_model: OdiaGenAI/mistral_hindi_7b_base_v1
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```
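A note on the `weight: [0.1, 0.3, 0.6, 0.7]` line: in mergekit, a list of values for a scalar parameter is treated as a gradient that is spread across the model's layers, so early layers lean toward the base model and later layers toward BioMistral. The sketch below shows one plausible way such a gradient could expand to per-layer weights via linear interpolation; the helper name and the assumption of evenly spaced anchor points are illustrative, not taken from mergekit's source.

```python
import numpy as np

def expand_gradient(gradient, num_layers):
    """Illustrative expansion of a short gradient list into per-layer weights,
    assuming the anchor values are evenly spaced over the layer index."""
    anchors = np.linspace(0, num_layers - 1, num=len(gradient))
    return np.interp(np.arange(num_layers), anchors, gradient)

# Hypothetical 32-layer model with the gradient from the config above.
per_layer = expand_gradient([0.1, 0.3, 0.6, 0.7], 32)
```

Under this reading, layer 0 would get weight 0.1 and the final layer 0.7, with a smooth ramp in between.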