Update README.md
README.md CHANGED
@@ -12,9 +12,8 @@ An experimentation regarding 'lasering' each expert to denoise and enhance model
 
 This model has half size in comparison to the Mixtral 8x7b Instruct. And it basically has the same level of performance (we are working to get a better MMLU score).
 
-Used models (all lasered using laserRMT, except for the base model):
 
-# Laserxtral - 4x7b
+# Laserxtral - 4x7b (all lasered using laserRMT)
 
 This model is a Mixture of Experts (MoE) made with [mergekit](https://github.com/cg123/mergekit) (mixtral branch). It uses the following base models:
 * [cognitivecomputations/dolphin-2.6-mistral-7b-dpo](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo)
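For context, a mergekit MoE merge like the one described in the README is driven by a YAML config passed to the `mergekit-moe` tool. The sketch below is illustrative only, not the actual Laserxtral config: the `gate_mode`, `positive_prompts` values, and the choice of base model are assumptions, and the three remaining experts of the 4x7b mixture are elided rather than guessed.

```yaml
# Hypothetical mergekit-moe config sketch (not the real Laserxtral recipe).
base_model: cognitivecomputations/dolphin-2.6-mistral-7b-dpo
gate_mode: hidden        # route tokens via hidden-state similarity to the prompts below
dtype: bfloat16
experts:
  - source_model: cognitivecomputations/dolphin-2.6-mistral-7b-dpo
    positive_prompts:    # example routing prompts; illustrative, not from the source
      - "general chat and instruction following"
  # ...three more lasered experts would follow for a 4x7b mixture
```

A config in this shape would typically be run as `mergekit-moe config.yml ./output-model` to produce the merged MoE checkpoint.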