---
library_name: transformers
license: mit
---
# Phi-4 SLERP Merge Model
## Model Description
This is a merged language model created using the **Spherical Linear Interpolation (SLERP) merge method**, allowing for a smooth blend of features from both parent models across different layers. The merge optimizes reasoning, general knowledge, and task-specific performance by strategically interpolating attention and MLP components.
---
## Merge Details
**Merge Method:**
The model was merged using **SLERP (Spherical Linear Interpolation)** rather than a traditional linear merge, ensuring a well-balanced combination of both source models while maintaining coherent weight transitions.
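To make the difference from a linear merge concrete, here is a minimal sketch of spherical linear interpolation between two weight tensors. This is illustrative only, not mergekit's actual implementation; the function name and `eps` fallback are assumptions:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors.

    Unlike a linear blend (1 - t) * v0 + t * v1, SLERP weights the two
    tensors by angles along the arc between their directions, which keeps
    the transition between the source models' geometries smooth.
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Angle between the (normalized) directions of the two tensors.
    u0 = v0 / (np.linalg.norm(v0) + eps)
    u1 = v1 / (np.linalg.norm(v1) + eps)
    theta = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))
    if theta < eps:
        # Nearly parallel directions: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    sin_theta = np.sin(theta)
    return (np.sin((1 - t) * theta) / sin_theta) * v0 \
         + (np.sin(t * theta) / sin_theta) * v1
```

At `t = 0` this returns the first tensor exactly, at `t = 1` the second, and intermediate values follow the arc rather than the chord between them.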
**Base Model:**
- **bunnycore/Phi-4-RR-Shoup** (used as the primary base)
---
## Models Merged
The following models were included in this merge:
1. **bunnycore/Phi-4-RR-Shoup** (Primary base)
2. **bunnycore/Phi-4-Model-Stock-v4**
---
## Configuration
The following YAML configuration was used to produce this merged model:
```yaml
slices:
  - sources:
      - model: bunnycore/Phi-4-RR-Shoup
        layer_range: [0, 32]
      - model: bunnycore/Phi-4-Model-Stock-v4
        layer_range: [0, 32]
merge_method: slerp
base_model: bunnycore/Phi-4-RR-Shoup
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
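Each five-element `value` list above acts as a gradient: mergekit stretches it across the 32 layers, so `self_attn` weights shift from the base model toward the second model with depth, while `mlp` weights move the opposite way. A sketch of that piecewise-linear stretching (the helper name `layer_gradient` is mine, and the exact anchor placement is an assumption about mergekit's gradient semantics):

```python
import numpy as np

def layer_gradient(values, num_layers):
    """Expand a short list of t anchors into one t value per layer.

    The anchors are spaced evenly over the layer indices and the values
    in between are filled by linear interpolation.
    """
    anchors = np.linspace(0, num_layers - 1, num=len(values))
    return np.interp(np.arange(num_layers), anchors, values)

# Per-layer interpolation factors implied by the config above.
attn_t = layer_gradient([0, 0.5, 0.3, 0.7, 1], 32)  # 0 at layer 0 -> 1 at layer 31
mlp_t = layer_gradient([1, 0.5, 0.7, 0.3, 0], 32)   # mirror schedule for MLP blocks
```

A `t` of 0 keeps the base model's weights for that layer and a `t` of 1 takes the second model's; the unfiltered `value: 0.5` is the even blend applied to all remaining parameters.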