Paper: DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling (arXiv:2406.11617)
A merge of some of the best RP finetunes the community has to offer.
Note: This model can be very creative; because of that, I wouldn't recommend a temperature above 0.7.
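If you want a starting point, here is a minimal sketch of loading the model with `transformers` and sampling at the recommended temperature. The repo id below is a placeholder, not this model's actual name; substitute your own path.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: replace with the actual model repository.
repo = "your-name/your-merged-model"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("Once upon a time", return_tensors="pt").to(model.device)
# Keep temperature at or below 0.7, per the note above.
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```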
This model was merged using the DELLA merge method, with mistralai/Mistral-Small-3.2-24B-Instruct-2506 as the base.
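In the config below, each model's `density` sets roughly what fraction of its delta parameters survive pruning, and `epsilon` widens the spread of drop probabilities so that higher-magnitude deltas are more likely to be kept. Here is a toy, single-tensor sketch of that magnitude-based sampling idea; the linear rank-to-probability mapping is an assumption for illustration, not mergekit's actual implementation.

```python
import torch

def magprune_sketch(delta: torch.Tensor, density: float = 0.4,
                    epsilon: float = 0.25) -> torch.Tensor:
    """Toy sketch of DELLA-style magnitude-based sampling (assumed mapping).

    Drop probabilities are centered on (1 - density) and spread over a
    window of width epsilon, with larger-magnitude deltas getting lower
    drop probabilities. Survivors are rescaled by 1 / keep_prob so the
    expected value of each delta is preserved.
    """
    flat = delta.flatten()
    n = flat.numel()
    # Rank parameters by magnitude: rank 0 = smallest, n-1 = largest.
    ranks = flat.abs().argsort().argsort().float()
    # Map ranks linearly into [(1-density) - eps/2, (1-density) + eps/2];
    # the largest magnitudes get the lowest drop probability.
    drop_p = (1.0 - density) + epsilon * (0.5 - ranks / max(n - 1, 1))
    drop_p = drop_p.clamp(0.0, 1.0)
    keep = torch.bernoulli(1.0 - drop_p)
    # Rescale survivors so the merge is unbiased in expectation.
    pruned = flat * keep / (1.0 - drop_p).clamp_min(1e-8)
    return pruned.view_as(delta)
```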
The following models were included in the merge:

- zerofata/MS3.2-PaintedFantasy-v3-24B
- TheDrummer/Cydonia-24B-v4
- CrucibleLab/M3.2-24B-Loki-V1.3
- Gryphe/Codex-24B-Small-3.2
- ReadyArt/Dark-Nexus-24B-v2.0
- Doctor-Shotgun/MS3.2-24B-Magnum-Diamond
The following YAML configuration was used to produce this model:
models:
  - model: mistralai/Mistral-Small-3.2-24B-Instruct-2506
  - model: zerofata/MS3.2-PaintedFantasy-v3-24B # Pink/Red
    parameters:
      weight: 0.3
      density: 1.0
      epsilon: 0.0
  - model: TheDrummer/Cydonia-24B-v4 # Black
    parameters:
      weight: 0.3
      density: 0.4
      epsilon: 0.25
  - model: CrucibleLab/M3.2-24B-Loki-V1.3 # Green
    parameters:
      weight: 0.3
      density: 0.4
      epsilon: 0.25
  - model: Gryphe/Codex-24B-Small-3.2 # Blue (Yes, blue. Fuck you.)
    parameters:
      weight: 0.3
      density: 0.4
      epsilon: 0.25
  - model: ReadyArt/Dark-Nexus-24B-v2.0 # Orange
    parameters:
      weight: 0.3
      density: 0.4
      epsilon: 0.25
  - model: Doctor-Shotgun/MS3.2-24B-Magnum-Diamond # White
    parameters:
      weight: 0.3
      density: 0.4
      epsilon: 0.25
# Seed: 199803
merge_method: della
parameters:
  lambda: 0.9
  normalize: true
  int8_mask: false
base_model: mistralai/Mistral-Small-3.2-24B-Instruct-2506
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: zerofata/MS3.2-PaintedFantasy-v3-24B
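To reproduce the merge, here is a sketch using mergekit's documented Python API, assuming the YAML above has been saved as `config.yaml`; the `mergekit-yaml` command-line tool works equally well. The output path is a placeholder.

```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the YAML config above into mergekit's configuration object.
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./merged",          # placeholder output directory
    options=MergeOptions(
        cuda=True,                # set False to merge on CPU
        copy_tokenizer=True,      # honor the tokenizer source above
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```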