Editing Models with Task Arithmetic (arXiv:2212.04089)
This is a merge of pre-trained language models created using mergekit.
This model was merged with the Task Arithmetic merge method, with /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Instruc-base as the base model.
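As a rough illustration of what task arithmetic does, the sketch below applies the rule merged = base + sum_i w_i * (model_i - base) to toy parameter vectors. Plain Python lists stand in for tensors. The function name `task_arithmetic`, the tensor name `layer.weight`, and the toy numbers are assumptions made for illustration; this is not mergekit's API.

```python
from typing import Dict, List, Sequence

Params = Dict[str, List[float]]  # tensor name -> flat list of values (toy stand-in)


def task_arithmetic(base: Params, models: Sequence[Params],
                    weights: Sequence[float]) -> Params:
    """Return base + sum_i weights[i] * (models[i] - base), tensor by tensor."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per source model")
    merged: Params = {}
    for name, base_values in base.items():
        out = list(base_values)  # start from the base parameters
        for model, w in zip(models, weights):
            # add the scaled task vector (fine-tuned minus base) for this model
            for k, (theta, theta0) in enumerate(zip(model[name], base_values)):
                out[k] += w * (theta - theta0)
        merged[name] = out
    return merged


# Hypothetical toy values, chosen so the arithmetic is easy to check by hand.
base = {"layer.weight": [1.0, 2.0]}
thinking = {"layer.weight": [2.0, 2.0]}   # task vector: [1.0, 0.0]
leetcode = {"layer.weight": [1.0, 4.0]}   # task vector: [0.0, 2.0]

merged = task_arithmetic(base, [thinking, leetcode], [0.5, 0.25])
print(merged)  # {'layer.weight': [1.5, 2.5]}
```

A weight of 0.0 leaves the base values untouched, which is how the default weight in the configuration excludes tensors that no filter selects.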
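The following models were included in the merge:
- /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Thinking-v1.1
- /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Instruct_LeetCodeDataset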
The following YAML configuration was used to produce this model:
merge_method: task_arithmetic
dtype: bfloat16
base_model: /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Instruc-base
parameters:
  # default: take no delta from a source model when no filter matches
  weight: 0.0
models:
  # ===== Thinking: layers 20..27 only =====
  - model: /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Thinking-v1.1
    parameters:
      weight:
        # embedding/head: small non-zero weight (not fully frozen)
        - filter: model.embed_tokens
          value: 0.02
        - filter: lm_head
          value: 0.02
        # layers 20..27
        - filter: model.layers.20
          value: 0.03
        - filter: model.layers.21
          value: 0.03
        - filter: model.layers.22
          value: 0.03
        - filter: model.layers.23
          value: 0.03
        - filter: model.layers.24
          value: 0.03
        - filter: model.layers.25
          value: 0.03
        - filter: model.layers.26
          value: 0.03
        - filter: model.layers.27
          value: 0.03
  # ===== LeetCode: layers 12..27 =====
  - model: /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Instruct_LeetCodeDataset
    parameters:
      weight:
        # embedding/head: same weight as the layers (not frozen)
        - filter: model.embed_tokens
          value: 0.08
        - filter: lm_head
          value: 0.08
        # layers 12..27
        - filter: model.layers.12
          value: 0.08
        - filter: model.layers.13
          value: 0.08
        - filter: model.layers.14
          value: 0.08
        - filter: model.layers.15
          value: 0.08
        - filter: model.layers.16
          value: 0.08
        - filter: model.layers.17
          value: 0.08
        - filter: model.layers.18
          value: 0.08
        - filter: model.layers.19
          value: 0.08
        - filter: model.layers.20
          value: 0.08
        - filter: model.layers.21
          value: 0.08
        - filter: model.layers.22
          value: 0.08
        - filter: model.layers.23
          value: 0.08
        - filter: model.layers.24
          value: 0.08
        - filter: model.layers.25
          value: 0.08
        - filter: model.layers.26
          value: 0.08
        - filter: model.layers.27
          value: 0.08
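
To make the per-tensor weights in the configuration easier to follow, the sketch below resolves the weight that a given tensor name would receive from each source model. It assumes that a filter applies when its string occurs in the tensor name and that unmatched tensors fall back to the top-level default of 0.0. That is one reading of the config, not a copy of mergekit's implementation. The helper names `layer_filters` and `resolve_weight`, and the example tensor names, are hypothetical.

```python
from typing import List, Tuple

Filters = List[Tuple[str, float]]  # (filter string, weight value), in config order
DEFAULT_WEIGHT = 0.0               # top-level `weight: 0.0` in the config


def layer_filters(first: int, last: int, value: float) -> Filters:
    """Build one filter entry per decoder layer in the inclusive range."""
    return [(f"model.layers.{i}", value) for i in range(first, last + 1)]


# Filter lists mirroring the two `models:` entries of the config.
THINKING: Filters = (
    [("model.embed_tokens", 0.02), ("lm_head", 0.02)] + layer_filters(20, 27, 0.03)
)
LEETCODE: Filters = (
    [("model.embed_tokens", 0.08), ("lm_head", 0.08)] + layer_filters(12, 27, 0.08)
)


def resolve_weight(tensor_name: str, filters: Filters,
                   default: float = DEFAULT_WEIGHT) -> float:
    """Return the value of the first filter found in the tensor name, else the default.

    Assumption: plain substring matching. It is unambiguous here because no
    filter string in this config is a prefix of another layer's name.
    """
    for pattern, value in filters:
        if pattern in tensor_name:
            return value
    return default


# Hypothetical tensor names in the usual Qwen2 naming style.
for name in (
    "model.layers.5.mlp.up_proj.weight",
    "model.layers.15.self_attn.q_proj.weight",
    "model.layers.22.self_attn.q_proj.weight",
    "model.norm.weight",
):
    print(name, resolve_weight(name, THINKING), resolve_weight(name, LEETCODE))
```

Under these assumptions, layers 0..11 and the final norm stay identical to the base model, layers 12..19 receive only the LeetCode delta, and layers 20..27 receive deltas from both source models.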