---
library_name: transformers
license: mit
---

# Phi-4 SLERP Merge Model

## Model Description

This is a merged language model created with the Spherical Linear Interpolation (SLERP) merge method, which blends features from both parent models smoothly across layers. The merge aims to balance reasoning, general knowledge, and task-specific performance by interpolating the attention and MLP components with different weights at different depths.


## Merge Details

### Merge Method

The model was merged using SLERP (Spherical Linear Interpolation) rather than a plain linear average. SLERP interpolates along the arc between the two weight vectors instead of the straight line connecting them, which better preserves weight magnitudes and yields smoother transitions between the source models' parameters.
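For intuition, the core SLERP formula can be sketched in a few lines. This is a minimal illustration on plain Python lists, not mergekit's actual implementation, which applies the interpolation tensor-by-tensor with additional handling for degenerate cases:

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t = 0 returns v0, t = 1 returns v1; intermediate t values move
    along the arc between them rather than the straight chord.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped for numerical safety
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    omega = math.acos(dot)
    if math.sin(omega) < eps:
        # Nearly parallel vectors: fall back to linear interpolation
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```

Unlike a linear average, the midpoint of two unit vectors under SLERP is still a unit vector, which is why the method tends to keep merged weights in a sensible range.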

### Base Model

- bunnycore/Phi-4-RR-Shoup (used as the primary base)

### Models Merged

The following models were included in this merge:

1. bunnycore/Phi-4-RR-Shoup (primary base)
2. bunnycore/Phi-4-Model-Stock-v4

## Configuration

The following YAML configuration was used to produce this merged model:

```yaml
slices:
  - sources:
      - model: bunnycore/Phi-4-RR-Shoup
        layer_range: [0, 32]
      - model: bunnycore/Phi-4-Model-Stock-v4
        layer_range: [0, 32]
merge_method: slerp
base_model: bunnycore/Phi-4-RR-Shoup
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
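The five-element `t` lists act as a gradient across the layer stack: mergekit expands a list of anchor values into a per-layer interpolation factor, and components not matched by the `self_attn` or `mlp` filters use the flat fallback of 0.5. The sketch below shows one plausible piecewise-linear expansion; `expand_t_gradient` is a hypothetical helper for illustration, not mergekit's actual code:

```python
def expand_t_gradient(anchors, n_layers):
    """Expand a short list of anchor t-values into a per-layer gradient
    via piecewise-linear interpolation (an assumed scheme; mergekit's
    exact expansion may differ)."""
    out = []
    for i in range(n_layers):
        # Map layer index onto the anchor list's [0, len-1] span
        pos = i / (n_layers - 1) * (len(anchors) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(anchors) - 1)
        frac = pos - lo
        out.append(anchors[lo] * (1 - frac) + anchors[hi] * frac)
    return out

# The schedules from the config above, expanded over the 32 merged layers
attn_t = expand_t_gradient([0, 0.5, 0.3, 0.7, 1], 32)
mlp_t = expand_t_gradient([1, 0.5, 0.7, 0.3, 0], 32)
```

Note that the two schedules are deliberately complementary: at any depth where attention leans toward one source model, the MLP leans toward the other by the same amount, so the merge mixes the models' components rather than favoring one wholesale. To reproduce the merge, a config like the one above can (if mergekit is installed) be passed to its `mergekit-yaml` CLI along with an output directory.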