Commit 8b7e646 · Parent: 71788ee · Create README.md

README.md (ADDED, @@ -0,0 +1,43 @@)
---
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
- mistralai/Mistral-7B-Instruct-v0.1
tags:
- mergekit
- merge
- moe
---
# Mistral Instruct MoE experimental

This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit), using the `mixtral` branch.

**This is an experimental model and has nothing to do with Mixtral. Mixtral is not a merge of models per se, but a transformer with MoE layers learned during training.**

This uses a random gate, so I don't expect great results. We'll see!

## Merge Details

### Merge Method

This model was merged using the MoE merge method.

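For intuition, here is a minimal, illustrative sketch (not mergekit's actual code) of what a merged MoE feed-forward layer looks like: each expert is the MLP block taken from one of the source models, and a small router ("gate") picks which experts handle each token. With `gate_mode: random`, the router weights are initialized randomly rather than derived from prompts, so routing starts out essentially arbitrary. All class names and dimensions below are made up for illustration.

```python
import torch
import torch.nn as nn


class SketchMoEFeedForward(nn.Module):
    """Toy stand-in for one merged MoE layer (illustration only, not mergekit code)."""

    def __init__(self, hidden_size: int = 4096, ffn_size: int = 14336,
                 num_experts: int = 2, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # One feed-forward "expert" per source model (here: Instruct v0.2 and v0.1).
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size, bias=False),
                nn.SiLU(),
                nn.Linear(ffn_size, hidden_size, bias=False),
            )
            for _ in range(num_experts)
        )
        # The router ("gate"). With gate_mode: random it starts from a random init
        # instead of being derived from hidden states of positive_prompts.
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_size)
        scores = self.gate(x)                              # (num_tokens, num_experts)
        weights, chosen = scores.topk(self.top_k, dim=-1)  # top-k experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for idx, expert in enumerate(self.experts):
                mask = chosen[:, slot] == idx
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = SketchMoEFeedForward(hidden_size=64, ffn_size=128)
    tokens = torch.randn(10, 64)
    print(layer(tokens).shape)  # torch.Size([10, 64])
```
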
### Models Merged

The following models were included in the merge:
* [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
* [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: mistralai/Mistral-7B-Instruct-v0.2
gate_mode: random
dtype: bfloat16
experts:
  - source_model: mistralai/Mistral-7B-Instruct-v0.2
    positive_prompts: [""]
  - source_model: mistralai/Mistral-7B-Instruct-v0.1
    positive_prompts: [""]
```
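A config like the one above is fed to mergekit's MoE merge script (historically on the `mixtral` branch; in current mergekit this is the `mergekit-moe` entry point). The result follows the Mixtral architecture, so it should load with a recent `transformers` (4.36 or later). A minimal usage sketch, where the model path is a placeholder for wherever the merged weights live, not a real repo id:

```python
# Hedged usage sketch. "path/or/repo-of-this-merge" is a placeholder; point it
# at the mergekit output directory or the hosted merged weights. Requires
# transformers >= 4.36 (Mixtral support); device_map="auto" also needs accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/or/repo-of-this-merge"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Mistral-Instruct style prompt format.
prompt = "[INST] Explain what a mixture-of-experts layer does. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```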