Paper: Editing Models with Task Arithmetic (arXiv:2212.04089)
This is a merge of pre-trained language models created using mergekit.
This model was merged using the Task Arithmetic merge method, with /home/pinzuli/models/ministral/Ministral-3-3B-Base-2512/ as the base model.
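The following models were included in the merge:
- /home/pinzuli/models/ministral/Ministral-3-3B-Reasoning-2512
- /home/pinzuli/swift_trainer/output/ministral-3-3b-base_cp-v2.0.0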
The following YAML configuration was used to produce this model:
# Chat Vector Merge using Task Arithmetic
#
# This config applies the "chat vector" from an instruct model to a target model.
# The chat vector is computed as: chat_model - base_model
# Then applied as: target_model + weight * (chat_model - base_model)
#
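# Here the original base model is set as base_model, and both the chat model
# and the target model (checkpoint) are listed under models. The merge computes
#   base_model + weight * (chat_model - base_model) + 1.0 * (target_model - base_model)
# which equals target_model + weight * (chat_model - base_model).
# The chat model's weight is 0 for embed_tokens and lm_head, so the chat vector
# is not applied to those tensors.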
models:
  # The instruct/chat model that provides the "chat capabilities"
  - model: /home/pinzuli/models/ministral/Ministral-3-3B-Reasoning-2512
    parameters:
      weight:
        # 1.0
        - filter: embed_tokens
          value: 0
        - filter: lm_head
          value: 0
        - value: 1.0
  # The target model (your fine-tuned checkpoint)
  - model: /home/pinzuli/swift_trainer/output/ministral-3-3b-base_cp-v2.0.0
    parameters:
      weight: 1.0
merge_method: task_arithmetic
base_model: /home/pinzuli/models/ministral/Ministral-3-3B-Base-2512/
tokenizer_source: /home/pinzuli/models/ministral/Ministral-3-3B-Reasoning-2512
dtype: bfloat16
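
The arithmetic this configuration describes can be illustrated with a small sketch. This is not mergekit's implementation: the helper names `weight_for` and `task_arithmetic`, the toy tensor names and values, and the first-match substring rule for `filter` are all assumptions made for illustration. It also ignores normalization, dtype casting, and tokenizer handling.

```python
from typing import Dict, List, Tuple

Tensor = List[float]            # toy stand-in for a real weight tensor
StateDict = Dict[str, Tensor]   # tensor name -> values


def weight_for(name: str, rules: List[dict]) -> float:
    """Pick a weight for a tensor name (hypothetical helper).

    Assumption: a rule with "filter" applies when the filter string occurs
    in the tensor name; a rule without "filter" acts as the default.
    """
    for rule in rules:
        flt = rule.get("filter")
        if flt is None or flt in name:
            return rule["value"]
    raise ValueError(f"no weight rule matches {name}")


def task_arithmetic(base: StateDict,
                    models: List[Tuple[StateDict, List[dict]]]) -> StateDict:
    """Compute base + sum_i w_i * (model_i - base) for every tensor."""
    merged: StateDict = {}
    for name, base_t in base.items():
        out = list(base_t)
        for state, rules in models:
            w = weight_for(name, rules)
            for i, (m, b) in enumerate(zip(state[name], base_t)):
                out[i] += w * (m - b)  # add the weighted task vector
        merged[name] = out
    return merged


# Toy values only; real checkpoints hold large tensors.
base = {"model.embed_tokens.weight": [1.0, 1.0],
        "model.layers.0.mlp.weight": [1.0, 2.0],
        "lm_head.weight": [0.5]}
chat = {"model.embed_tokens.weight": [3.0, 3.0],
        "model.layers.0.mlp.weight": [1.5, 2.5],
        "lm_head.weight": [0.75]}
target = {"model.embed_tokens.weight": [1.25, 1.0],
          "model.layers.0.mlp.weight": [2.0, 2.0],
          "lm_head.weight": [0.5]}

# Mirrors the weights in the configuration above.
chat_rules = [{"filter": "embed_tokens", "value": 0},
              {"filter": "lm_head", "value": 0},
              {"value": 1.0}]
target_rules = [{"value": 1.0}]

merged = task_arithmetic(base, [(chat, chat_rules), (target, target_rules)])
print(merged)
```

In this sketch the embedding and output-head tensors end up identical to the target checkpoint's, because the chat vector's weight is 0 there, while every other tensor is the target checkpoint plus the full chat vector.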