---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
- deepseek-ai/DeepSeek-R1-Distill-Llama-8B
---

# KRDModel

KRDModel is a SLERP merge of the following models, created with [mergekit](https://github.com/cg123/mergekit):
* [prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M](https://huggingface.co/prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M)
* [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B)

## 🧩 Configuration

```yaml
slices:
- sources:
  - model: prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
    layer_range:
    - 0
    - 32
  - model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
    layer_range:
    - 0
    - 32
merge_method: slerp
base_model: prithivMLmods/Qwen2.5-14B-DeepSeek-R1-1M
parameters:
  t:
  - filter: self_attn
    value:
    - 0
    - 0.5
    - 0.3
    - 0.7
    - 1
  - filter: mlp
    value:
    - 1
    - 0.5
    - 0.7
    - 0.3
    - 0
  - value: 0.5
dtype: bfloat16
```
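
To make the `t` parameter in the configuration above easier to read, here is a minimal Python sketch of the two ideas involved: spherical linear interpolation (SLERP) between two weight vectors, and spreading a list of anchor values such as `[0, 0.5, 0.3, 0.7, 1]` across the layer range. The function names `slerp` and `gradient_value` are hypothetical and are not mergekit's API, and the linear interpolation between anchors is an assumption about how the gradient list is applied, so treat this as an illustration rather than the actual implementation. With this convention, `t = 0` returns the first vector (the `base_model`) and `t = 1` returns the second.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1. Falls back to plain linear
    interpolation when either vector is near zero or the two are
    nearly colinear, where the SLERP formula is numerically unstable.
    """
    lerp = [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    if n0 < eps or n1 < eps:
        return lerp
    # Cosine of the angle between the two vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 1.0 - 1e-6:
        return lerp
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def gradient_value(anchors, layer_idx, num_layers):
    """Pick t for one layer by linearly interpolating a list of anchors.

    Assumption: anchors are spaced evenly from the first layer to the last.
    """
    if num_layers == 1 or len(anchors) == 1:
        return anchors[0]
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1.0 - frac) * anchors[lo] + frac * anchors[hi]


if __name__ == "__main__":
    self_attn_anchors = [0, 0.5, 0.3, 0.7, 1]  # from the config above
    mlp_anchors = [1, 0.5, 0.7, 0.3, 0]        # from the config above
    for layer in (0, 8, 16, 24, 31):
        t_attn = gradient_value(self_attn_anchors, layer, 32)
        t_mlp = gradient_value(mlp_anchors, layer, 32)
        print(f"layer {layer:2d}: self_attn t={t_attn:.3f}, mlp t={t_mlp:.3f}")
```

Under these assumptions, the `self_attn` weights start at the base model in the earliest layers and move toward the second model in the deepest ones, the `mlp` weights follow the opposite schedule, and all remaining tensors use a fixed `t` of `0.5`.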