Qwen2.5-1.5b-leetcode-math-task-arithmetic-freeze-embed-partial

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged with the Task Arithmetic merge method, using /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Instruc-base as the base model.
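
As a rough illustration of what task arithmetic does, the sketch below adds weighted "task vectors" (fine-tuned weights minus base weights) to the base. It is a toy example on plain Python lists, not mergekit's implementation; the function name `task_arithmetic`, the tensor values, and the weights 0.5 and 0.25 are all made up for illustration.

```python
# Toy sketch of task arithmetic on flat lists of floats (hypothetical values).
# merged = base + sum_i weight_i * (finetuned_i - base)

def task_arithmetic(base, finetuned, weights):
    """Add each fine-tuned model's weighted task vector to the base tensor."""
    merged = list(base)
    for tensor, weight in zip(finetuned, weights):
        for i, (b, t) in enumerate(zip(base, tensor)):
            merged[i] += weight * (t - b)  # weighted delta from the base
    return merged

# Hypothetical stand-ins for one tensor of each model.
base = [1.0, 2.0, 3.0]
thinking = [2.0, 2.0, 1.0]
leetcode = [1.0, 4.0, 3.0]

merged = task_arithmetic(base, [thinking, leetcode], [0.5, 0.25])
print(merged)
```

A weight of 0.0 leaves the base tensor unchanged, which is what a default `weight: 0.0` relies on for tensors that no filter selects.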

Models Merged

The following models were included in the merge:

  • /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Thinking-v1.1
  • /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Instruct_LeetCodeDataset

Configuration

The following YAML configuration was used to produce this model:

merge_method: task_arithmetic
dtype: bfloat16

base_model: /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Instruc-base

parameters:
  # default: take no delta from the source models unless a filter matches
  weight: 0.0

models:
  # ===== Thinking: layers 20..27 only =====
  - model: /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Thinking-v1.1
    parameters:
      weight:
        # embedding/head: small non-zero weight (0.02), so nearly but not fully frozen
        - filter: model.embed_tokens
          value: 0.02
        - filter: lm_head
          value: 0.02

        # layers 20..27
        - filter: model.layers.20
          value: 0.03
        - filter: model.layers.21
          value: 0.03
        - filter: model.layers.22
          value: 0.03
        - filter: model.layers.23
          value: 0.03
        - filter: model.layers.24
          value: 0.03
        - filter: model.layers.25
          value: 0.03
        - filter: model.layers.26
          value: 0.03
        - filter: model.layers.27
          value: 0.03

  # ===== LeetCode: layers 12..27 =====
  - model: /content/drive/MyDrive/Khoá_luận_tốt_nghiệp/Model/Qwen2.5-1.5B-Instruct_LeetCodeDataset
    parameters:
      weight:
        # embedding/head: weight 0.08, the same as the layers below (not frozen)
        - filter: model.embed_tokens
          value: 0.08
        - filter: lm_head
          value: 0.08

        # layers 12..27
        - filter: model.layers.12
          value: 0.08
        - filter: model.layers.13
          value: 0.08
        - filter: model.layers.14
          value: 0.08
        - filter: model.layers.15
          value: 0.08
        - filter: model.layers.16
          value: 0.08
        - filter: model.layers.17
          value: 0.08
        - filter: model.layers.18
          value: 0.08
        - filter: model.layers.19
          value: 0.08
        - filter: model.layers.20
          value: 0.08
        - filter: model.layers.21
          value: 0.08
        - filter: model.layers.22
          value: 0.08
        - filter: model.layers.23
          value: 0.08
        - filter: model.layers.24
          value: 0.08
        - filter: model.layers.25
          value: 0.08
        - filter: model.layers.26
          value: 0.08
        - filter: model.layers.27
          value: 0.08
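
To make the per-tensor weighting easier to follow, here is a minimal sketch of how the filter lists above could resolve to a weight for a given tensor name. It assumes that a filter matches when it occurs as a substring of the tensor name and that the first matching rule wins, falling back to the global `weight: 0.0`. The helpers `resolve_weight` and `layer_rules` and the example tensor names are hypothetical and are not part of mergekit.

```python
# Hypothetical helpers that mimic the filter -> weight lookup in the config above.
# Assumption: a filter matches when it is a substring of the tensor name.

def resolve_weight(tensor_name, rules, default=0.0):
    """Return the value of the first rule whose filter occurs in tensor_name."""
    for pattern, value in rules:
        if pattern in tensor_name:
            return value
    return default  # corresponds to the global `weight: 0.0`

def layer_rules(first, last, value):
    """Build one (filter, value) rule per layer index in [first, last]."""
    return [(f"model.layers.{i}", value) for i in range(first, last + 1)]

thinking_rules = [("model.embed_tokens", 0.02), ("lm_head", 0.02)] + layer_rules(20, 27, 0.03)
leetcode_rules = [("model.embed_tokens", 0.08), ("lm_head", 0.08)] + layer_rules(12, 27, 0.08)

for name in [
    "model.embed_tokens.weight",
    "model.layers.5.self_attn.q_proj.weight",
    "model.layers.15.mlp.down_proj.weight",
    "model.layers.22.self_attn.q_proj.weight",
    "model.norm.weight",
]:
    print(name, resolve_weight(name, thinking_rules), resolve_weight(name, leetcode_rules))
```

Under these assumptions, layers 0..11 and the final norm receive no delta from either model, layers 12..19 receive only the LeetCode delta, and layers 20..27 receive both.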