---
license: apache-2.0
language:
- en
tags:
- merge
base_model:
- Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
- EmbeddedLLM/Mistral-7B-Merge-14-v0.3
---
# Model Description

This is an experiment in merging 14 models using DARE TIES 🦙

We first merged 14 models to produce [EmbeddedLLM/Mistral-7B-Merge-14-v0.3](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.3),
which was then merged with [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp) using Gradient SLERP.
The resulting model scores well on the Open LLM Leaderboard (see below) but may benefit from further instruction fine-tuning.
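For intuition, SLERP blends two sets of weights along the arc between them rather than a straight line, and in Gradient SLERP the interpolation factor `t` additionally varies with layer depth (see the configuration below). The sketch that follows is purely illustrative and not mergekit's exact implementation; it only shows the interpolation formula applied to a pair of same-shaped weight tensors.

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Illustrative spherical linear interpolation between two weight tensors."""
    a_dir = a.ravel() / (np.linalg.norm(a) + eps)
    b_dir = b.ravel() / (np.linalg.norm(b) + eps)
    # Angle between the two tensors, treated as high-dimensional vectors.
    omega = np.arccos(np.clip(np.dot(a_dir, b_dir), -1.0, 1.0))
    if omega < eps:  # Nearly parallel tensors: plain linear interpolation suffices.
        return (1 - t) * a + t * b
    so = np.sin(omega)
    return (np.sin((1 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b
```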
## Open LLM Leaderboard

| Metric     | Score |
|------------|-------|
| Average    | 71.19 |
| ARC        | 66.81 |
| HellaSwag  | 86.15 |
| MMLU       | 65.10 |
| TruthfulQA | 58.25 |
| Winogrande | 80.03 |
| GSM8K      | 70.81 |
## Chat Template

The model works with either the ChatML or the Llama-2 chat template.
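As an illustrative sketch, prompts can be built with the tokenizer's stored chat template via `transformers`; the repository id below is a placeholder, and if the tokenizer does not ship a template you can format the prompt manually in ChatML or Llama-2 style instead.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Placeholder repo id -- substitute the actual model repository.
model_id = "your-org/your-merged-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what model merging is in one sentence."},
]
# apply_chat_template renders the chat template stored with the tokenizer (e.g. ChatML).
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```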
## Merge Configuration

The merge configuration used for this model is shown below:
```yaml
slices:
  - sources:
      - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
        layer_range: [0, 32]
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
        layer_range: [0, 32]
merge_method: slerp
base_model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
tokenizer_source: base
embed_slerp: true
dtype: bfloat16
```
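The `t` entries are lists of anchor values that are spread across the 32-layer slice, so self-attention and MLP weights receive different, depth-dependent blend factors, with `0.5` as the fallback for all other tensors. The snippet below is only a rough sketch of the per-layer schedule this implies (the exact interpolation mergekit performs may differ):

```python
import numpy as np

num_layers = 32
self_attn_anchors = [0, 0.5, 0.3, 0.7, 1]  # from the config above
mlp_anchors = [1, 0.5, 0.7, 0.3, 0]

def per_layer_t(anchors, n_layers):
    """Spread anchor values evenly across n_layers via linear interpolation."""
    xs = np.linspace(0.0, 1.0, num=len(anchors))
    return np.interp(np.linspace(0.0, 1.0, num=n_layers), xs, anchors)

# t near 0 keeps the base model's weights; t near 1 favors the other model.
print(np.round(per_layer_t(self_attn_anchors, num_layers), 2))
print(np.round(per_layer_t(mlp_anchors, num_layers), 2))
```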