Paper: FuseChat: Knowledge Fusion of Chat Models (arXiv:2408.07990)
Experimental merge of multiple Llama 3.2 3B models, guided by MoonRide-Index-v7. Created with mergekit.
This model was merged with the SCE merge method, using meta-llama/Llama-3.2-3B as the base.
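SCE (Select, Calculate, Erase) builds a task vector (delta from the base) for each source model, optionally keeps only the positions with the highest cross-model variance, weights each model by the squared magnitude of its delta, drops elements whose sign disagrees with the majority, and adds the weighted sum back to the base. A minimal numpy sketch of this idea for a single parameter matrix (an illustration of the general scheme, not mergekit's actual implementation; the function name and `select_frac` parameter are hypothetical):

```python
import numpy as np

def sce_merge(base, models, select_frac=1.0):
    """Toy sketch of an SCE-style (Select-Calculate-Erase) merge for one
    parameter matrix. `base` is the base model's weight matrix, `models`
    a list of fine-tuned weight matrices of the same shape."""
    # Task vectors: each model's delta from the base.
    stacked = np.stack([m - base for m in models])   # (n_models, *shape)

    # Select: keep only the fraction of positions with the highest
    # cross-model variance; zero out the rest.
    if select_frac < 1.0:
        var = stacked.var(axis=0)
        k = max(1, int(select_frac * var.size))
        thresh = np.sort(var.ravel())[-k]
        stacked = np.where(var >= thresh, stacked, 0.0)

    # Calculate: per-model coefficient proportional to the squared
    # magnitude of its (selected) task vector, normalized to sum to 1.
    sq = (stacked ** 2).reshape(len(models), -1).sum(axis=1)
    coeffs = sq / sq.sum()

    # Erase: zero elements whose sign disagrees with the majority
    # sign at that position (TIES-style sign consensus).
    majority = np.sign(stacked.sum(axis=0))
    stacked = np.where(np.sign(stacked) == majority, stacked, 0.0)

    # Recombine: weighted sum of task vectors added back to the base.
    return base + np.tensordot(coeffs, stacked, axes=1)
```

With four equally sized deltas, as in the config below, the coefficients come out near uniform and the sign-consensus step is what resolves conflicting fine-tunes.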
The following models were included in the merge:

* bunnycore/Llama-3.2-3B-Mix-Skill
* bunnycore/Llama-3.2-3B-Sci-Think
* FuseAI/FuseChat-Llama-3.2-3B-Instruct
* theprint/ReWiz-Llama-3.2-3B
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: bunnycore/Llama-3.2-3B-Mix-Skill
  - model: bunnycore/Llama-3.2-3B-Sci-Think
  - model: FuseAI/FuseChat-Llama-3.2-3B-Instruct
  - model: theprint/ReWiz-Llama-3.2-3B
base_model: meta-llama/Llama-3.2-3B
tokenizer:
  source: meta-llama/Llama-3.2-3B-Instruct
merge_method: sce
parameters:
  normalize: true
dtype: float32
out_dtype: float16
```
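As a toy numpy illustration of what the last three options mean in practice (an assumption-level sketch, not mergekit internals): `normalize: true` rescales the per-model merge weights to sum to 1, `dtype: float32` is the precision the merge arithmetic runs in, and `out_dtype: float16` is the precision the merged weights are written out in.

```python
import numpy as np

# Hypothetical per-model merge weights; normalize: true rescales
# them so they sum to 1 before the weighted combination.
weights = np.ones(4, dtype=np.float32)   # dtype: float32 -> merge math in fp32
weights /= weights.sum()                 # -> [0.25, 0.25, 0.25, 0.25]

# Stand-in for a merged parameter tensor, computed in float32 ...
merged = np.random.rand(4, 8).astype(np.float32)
# ... then cast to half precision on save (out_dtype: float16).
saved = merged.astype(np.float16)
```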
Detailed Open LLM Leaderboard evaluation results can be found here
| Metric | Value (%) |
|---|---|
| Avg. | 20.14 |
| IFEval (0-Shot) | 49.25 |
| BBH (3-Shot) | 22.69 |
| MATH Lvl 5 (4-Shot) | 16.16 |
| GPQA (0-shot) | 3.69 |
| MuSR (0-shot) | 5.50 |
| MMLU-PRO (5-shot) | 23.57 |