---
library_name: transformers
license: mit
---
# Phi-4 SLERP Merge Model

## Model Description  
This is a merged language model created with the **Spherical Linear Interpolation (SLERP)** merge method, which blends the two parent models smoothly across layers. The merge aims to balance reasoning, general knowledge, and task-specific performance by interpolating the attention and MLP components with different layer-wise schedules.

---

## Merge Details  

**Merge Method:**  
The model was merged with **SLERP (Spherical Linear Interpolation)** rather than a plain linear average. SLERP interpolates along the arc between weight vectors instead of the straight line between them, which preserves weight norms and produces smoother transitions between the two source models.
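As a minimal sketch of the idea (not mergekit's exact implementation), SLERP between two weight tensors can be written as:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    Sketch of the merge method's core idea: instead of the straight-line
    blend (1 - t) * v0 + t * v1, SLERP walks along the great-circle arc
    between the two vectors, which better preserves their norm geometry.
    """
    v0_f = np.ravel(v0).astype(np.float64)
    v1_f = np.ravel(v1).astype(np.float64)
    # Cosine of the angle between the two weight vectors
    cos_omega = np.dot(v0_f, v1_f) / (
        np.linalg.norm(v0_f) * np.linalg.norm(v1_f) + eps
    )
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return (1.0 - t) * v0 + t * v1
    sin_omega = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / sin_omega) * v0 + (
        np.sin(t * omega) / sin_omega
    ) * v1
```

At `t = 0` this returns the first model's weights, at `t = 1` the second's; for orthogonal unit vectors the midpoint still has unit norm, unlike a linear average.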

**Base Model:**  
- **bunnycore/Phi-4-RR-Shoup** (used as the primary base)

---

## Models Merged  
The following models were included in this merge:

1. **bunnycore/Phi-4-RR-Shoup** (Primary base)  
2. **bunnycore/Phi-4-Model-Stock-v4**  

---

## Configuration  
The following YAML configuration was used to produce this merged model:

```yaml
slices:
- sources:
  - model: bunnycore/Phi-4-RR-Shoup
    layer_range:
    - 0
    - 32
  - model: bunnycore/Phi-4-Model-Stock-v4
    layer_range:
    - 0
    - 32
merge_method: slerp
base_model: bunnycore/Phi-4-RR-Shoup
parameters:
  t:
  - filter: self_attn
    value:
    - 0
    - 0.5
    - 0.3
    - 0.7
    - 1
  - filter: mlp
    value:
    - 1
    - 0.5
    - 0.7
    - 0.3
    - 0
  - value: 0.5
dtype: bfloat16
```
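The five-element `value` lists under `t` are not per-layer values themselves; they are anchors of a gradient spread across the 32 merged layers, and any tensor matching neither filter uses the flat `t = 0.5`. A rough sketch of that expansion (assuming piecewise-linear interpolation; mergekit's exact spacing may differ):

```python
import numpy as np

def layer_schedule(anchors, num_layers):
    """Expand a short list of t anchors into one t value per layer.

    Sketch of how a gradient list like [0, 0.5, 0.3, 0.7, 1] is spread
    across a 32-layer layer_range via piecewise-linear interpolation.
    """
    anchor_pos = np.linspace(0.0, 1.0, num=len(anchors))
    layer_pos = np.linspace(0.0, 1.0, num=num_layers)
    return np.interp(layer_pos, anchor_pos, anchors)

attn_t = layer_schedule([0, 0.5, 0.3, 0.7, 1], 32)  # self_attn filter
mlp_t = layer_schedule([1, 0.5, 0.7, 0.3, 0], 32)   # mlp filter
```

Note that the two schedules are mirror images: wherever the attention weights lean toward Phi-4-Model-Stock-v4 (`t` near 1), the MLP weights in the same layer lean toward the base model, and vice versa.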