Merge method reference: *Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time* (arXiv:2203.05482)
Qwen3-4B-Mixture is a merged language model built on the Qwen3-4B architecture. It combines several fine-tuned variants of Qwen3-4B, with a particular emphasis on "uncensored" or less restricted versions. The merge aims to broaden the model's knowledge and reduce some of the biases and limitations often found in more heavily moderated models.
This model was merged using the Linear merge method.
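A linear merge is a weighted average of the constituent models' parameters, the technique described in the model soups paper cited above. The following is a minimal sketch of the idea, not mergekit's actual implementation; note that with `normalize: false` (as in the config below) the weights are applied as-is rather than rescaled to sum to 1:

```python
import torch
from transformers import AutoModelForCausalLM

def linear_merge(model_ids, weights, normalize=False, dtype=torch.bfloat16):
    """Weighted average of parameter tensors across fine-tuned models."""
    if normalize:
        # Rescale weights so they sum to 1 (the config below skips this).
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = None
    for model_id, w in zip(model_ids, weights):
        state = AutoModelForCausalLM.from_pretrained(
            model_id, torch_dtype=dtype
        ).state_dict()
        if merged is None:
            # Initialize the accumulator with the first weighted model.
            merged = {k: w * v for k, v in state.items()}
        else:
            # Accumulate each subsequent model's weighted parameters.
            for k, v in state.items():
                merged[k] += w * v
    return merged
```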
The following models were included in the merge:

* mlabonne/Qwen3-4B-abliterated
* ValiantLabs/Qwen3-4B-Esper3
* fakezeta/amoral-Qwen3-4B
* Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v1
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: mlabonne/Qwen3-4B-abliterated
    parameters:
      weight: 0.4
  - model: ValiantLabs/Qwen3-4B-Esper3
    parameters:
      weight: 0.5
  - model: fakezeta/amoral-Qwen3-4B
    parameters:
      weight: 0.6
  - model: Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v1
    parameters:
      weight: 0.4
merge_method: linear
normalize: false
int8_mask: true
dtype: bfloat16
```
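Assuming the configuration above is saved as `config.yaml` (a hypothetical filename) and merged with mergekit's `mergekit-yaml` CLI, the result loads like any other Qwen3 checkpoint:

```python
# First, run the merge from a shell:
#   mergekit-yaml config.yaml ./Qwen3-4B-Mixture
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model from the local output directory.
model = AutoModelForCausalLM.from_pretrained(
    "./Qwen3-4B-Mixture", torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("./Qwen3-4B-Mixture")

prompt = "Explain model merging in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```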