---
license: cc-by-nc-4.0
language:
- en
tags:
- merge
base_model:
- EmbeddedLLM/Mistral-7B-Merge-14-v0.3
- Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
- openchat/openchat-3.5-0106
- mlabonne/NeuralMarcoro14-7B
---
# Update 2024-01-21

Because [mlabonne/NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B) has updated its license to CC-BY-NC, this model's license follows suit.
# Model Description

This is an experiment to test merging 14 models using DARE TIES 🦙

1. We first merge 14 models to produce [EmbeddedLLM/Mistral-7B-Merge-14-v0.3](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.3).
2. The resulting model is merged again using DARE TIES with:
   - [Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp](https://huggingface.co/Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp)
   - [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)
   - [mlabonne/NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B)
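For intuition, here is a rough per-tensor sketch of what DARE TIES does: each fine-tuned model contributes a "task vector" (its delta from the base weights), DARE randomly drops a fraction of each delta's entries and rescales the survivors, and TIES keeps only the entries whose sign agrees with the aggregate sign before summing. This is an illustrative sketch only; the helper name and signature are hypothetical, and it is not mergekit's actual implementation.

```python
# Illustrative per-tensor sketch of DARE TIES (hypothetical helper,
# not mergekit's implementation).
import torch

def dare_ties(base: torch.Tensor,
              finetuned: list[torch.Tensor],
              weights: list[float],
              density: float = 0.5) -> torch.Tensor:
    deltas = []
    for ft, w in zip(finetuned, weights):
        delta = ft - base                          # task vector
        keep = torch.rand_like(delta) < density    # DARE: drop 1 - density
        delta = delta * keep / density             # rescale survivors
        deltas.append(w * delta)
    stacked = torch.stack(deltas)
    # TIES sign election: keep only entries agreeing with the majority sign
    elected = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == elected
    return base + (stacked * agree).sum(dim=0)
```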
## Open LLM Leaderboard

| Metric     | Value |
|------------|-------|
| Average    | 71.96 |
| ARC        | 68.69 |
| HellaSwag  | 86.45 |
| MMLU       | 65.65 |
| TruthfulQA | 59.12 |
| Winogrande | 80.66 |
| GSM8K      | 71.19 |
## Chat Template

Use either the ChatML or the Llama-2 chat template.
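For example, with the Hugging Face `transformers` tokenizer (the repo id below is a placeholder for this model, and the snippet assumes the bundled tokenizer config ships a ChatML template):

```python
from transformers import AutoTokenizer

# Placeholder repo id; substitute this model's actual Hub id.
tokenizer = AutoTokenizer.from_pretrained("EmbeddedLLM/Mistral-7B-Merge-14")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain model merging in one sentence."},
]
# Render the conversation with the tokenizer's chat template (ChatML here)
# and append the assistant prefix so the model continues from it.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```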
## Merge Configuration

The merge configuration for this model is shown below:
```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
    parameters:
      weight: 0.3
      density: 0.5
  - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
    parameters:
      weight: 0.2
      density: 0.5
  - model: openchat/openchat-3.5-0106
    parameters:
      weight: 0.2
      density: 0.5
  - model: mlabonne/NeuralMarcoro14-7B
    parameters:
      weight: 0.3
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
tokenizer_source: union
dtype: bfloat16
```
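To reproduce the merge, save the YAML above (e.g. as `config.yaml`) and run it through mergekit. A minimal sketch using mergekit's Python API as documented in its README; verify the names against the version you install:

```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the merge recipe shown above.
with open("config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Execute the DARE TIES merge and write the result to ./merged-model.
run_merge(
    merge_config,
    out_path="./merged-model",
    options=MergeOptions(cuda=True, copy_tokenizer=True),
)
```

The `mergekit-yaml config.yaml ./merged-model` command-line entry point is the equivalent one-liner.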