Paper: Resolving Interference When Merging Models (arXiv:2306.01708)
This is a merge of pre-trained language models created using mergekit.
This model was merged using the TIES merge method, with allenai/Olmo-3-1025-7B as the base.
The following models were included in the merge:

* allenai/Olmo-3.1-7B-RL-Zero-Math
The following YAML configuration was used to produce this model:

```yaml
# Magnitude pruning on OLMo-3.1-7B-RL-Zero-Math task vector
#
# Logic:
#   1. Compute task vector: delta = finetuned - base
#   2. Keep top 50% of delta by absolute magnitude, zero the rest
#   3. Output = base + pruned_delta
#
# Usage:
#   modal run modal_merge.py --config examples/olmo-math-magnitude-prune.yaml --hf-repo pmahdavi/Olmo-3.1-7B-Math-Pruned-50
merge_method: ties
base_model: allenai/Olmo-3-1025-7B
models:
  - model: allenai/Olmo-3.1-7B-RL-Zero-Math
    parameters:
      weight: 1.0
      density: 0.5  # Keep top 50% by magnitude
parameters:
  normalize: false  # Don't normalize (single model, no effect anyway)
dtype: bfloat16
```
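The pruning logic described in the comments above can be sketched as follows. This is a simplified illustration of per-tensor magnitude pruning of a task vector, not mergekit's actual implementation (mergekit operates over full model checkpoints and shards); the function name and the NumPy-based interface are assumptions for the example.

```python
import numpy as np

def prune_task_vector(base: np.ndarray, finetuned: np.ndarray,
                      density: float = 0.5) -> np.ndarray:
    """Sketch of magnitude pruning on a task vector.

    1. Compute task vector: delta = finetuned - base
    2. Keep the top `density` fraction of delta by absolute magnitude,
       zero the rest
    3. Return base + pruned_delta
    """
    delta = finetuned - base
    k = int(round(delta.size * density))  # number of entries to keep
    if k == 0:
        return base.copy()
    # Threshold = k-th largest absolute value in delta
    flat_abs = np.abs(delta).ravel()
    threshold = np.partition(flat_abs, delta.size - k)[delta.size - k]
    mask = np.abs(delta) >= threshold  # ties at the threshold are kept
    return base + delta * mask
```

With `density: 0.5`, half of the task-vector entries (those with the largest absolute values) survive; the rest of the delta is zeroed, leaving those weights at their base-model values.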