---
license: cc-by-nc-4.0
language:
- en
tags:
- merge
base_model:
- EmbeddedLLM/Mistral-7B-Merge-14-v0.3
- Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
- openchat/openchat-3.5-0106
- mlabonne/NeuralMarcoro14-7B
---
# Update 2024-01-21

Because [mlabonne/NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B) has updated its license to CC-BY-NC, this model's license follows suit.
# Model Description

This is an experiment to test merging 14 models using DARE TIES 🦙

1. We first merge 14 models to produce [EmbeddedLLM/Mistral-7B-Merge-14-v0.3](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.3).
2. The resulting model is merged again using DARE TIES with:
   - [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp)
   - [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)
   - [mlabonne/NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B)
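For intuition, here is a rough per-tensor sketch of what DARE TIES does: each fine-tuned model contributes a "task vector" (its delta from the base weights), DARE randomly drops a fraction of each delta's entries and rescales the survivors, and TIES keeps only the entries whose sign agrees with the aggregate sign before summing. This is an illustrative sketch only; the helper name and signature are hypothetical, and it is not mergekit's actual implementation.

```python
# Illustrative per-tensor sketch of DARE TIES (hypothetical helper,
# not mergekit's implementation).
import torch

def dare_ties(base: torch.Tensor,
              finetuned: list[torch.Tensor],
              weights: list[float],
              density: float = 0.5) -> torch.Tensor:
    deltas = []
    for ft, w in zip(finetuned, weights):
        delta = ft - base                          # task vector
        keep = torch.rand_like(delta) < density    # DARE: drop 1 - density
        delta = delta * keep / density             # rescale survivors
        deltas.append(w * delta)
    stacked = torch.stack(deltas)
    # TIES sign election: keep only entries agreeing with the majority sign
    elected = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == elected
    return base + (stacked * agree).sum(dim=0)
```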
## Open LLM Leaderboard

| Metric     | Value |
|------------|-------|
| Average    | 71.96 |
| ARC        | 68.69 |
| HellaSwag  | 86.45 |
| MMLU       | 65.65 |
| TruthfulQA | 59.12 |
| Winogrande | 80.66 |
| GSM8K      | 71.19 |
## Chat Template

Use either the ChatML or the Llama-2 chat template.
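For example, with the Hugging Face `transformers` tokenizer (the repo id below is a placeholder for this model, and the snippet assumes the bundled tokenizer config ships a ChatML template):

```python
from transformers import AutoTokenizer

# Placeholder repo id; substitute this model's actual Hub id.
tokenizer = AutoTokenizer.from_pretrained("EmbeddedLLM/Mistral-7B-Merge-14")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain model merging in one sentence."},
]
# Render the conversation with the tokenizer's chat template (ChatML here)
# and append the assistant prefix so the model continues from it.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```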
## Merge Configuration

The merge configuration for this model is shown below:
```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
    parameters:
      weight: 0.3
      density: 0.5
  - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
    parameters:
      weight: 0.2
      density: 0.5
  - model: openchat/openchat-3.5-0106
    parameters:
      weight: 0.2
      density: 0.5
  - model: mlabonne/NeuralMarcoro14-7B
    parameters:
      weight: 0.3
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
tokenizer_source: union
dtype: bfloat16
```
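To reproduce the merge, save the YAML above (e.g. as `config.yaml`) and run it through mergekit. A minimal sketch using mergekit's Python API as documented in its README; verify the names against the version you install:

```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the merge recipe shown above.
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Execute the DARE TIES merge and write the result to ./merged-model.
run_merge(
    merge_config,
    out_path="./merged-model",
    options=MergeOptions(cuda=True, copy_tokenizer=True),
)
```

The `mergekit-yaml config.yaml ./merged-model` command-line entry point is the equivalent one-liner.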