Update README.md
README.md (changed)
````diff
@@ -25,4 +25,14 @@ instructions = ["[INST] What is your favourite condiment? [/INST]",
 
 encodeds = [tokenizer.encode(instruction, add_special_tokens=i==0)
             for i, instruction in enumerate(instructions)]
-```
+```
+
+## Model Architecture
+This instruction model is based on Mistral-7B-v0.1, a transformer model with the following architecture choices:
+- Grouped-Query Attention
+- Sliding-Window Attention
+- Byte-fallback BPE tokenizer
+
+## The Mistral AI Team
+
+Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
````
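The context lines in the diff encode each `[INST] … [/INST]` turn separately, adding special tokens (the BOS token) only to the very first turn via `add_special_tokens=i==0`. A minimal sketch of that pattern, using a hypothetical `StubTokenizer` stand-in (its token ids are invented for illustration; a real setup would use a tokenizer such as the one shipped with the model):

```python
class StubTokenizer:
    """Hypothetical stand-in for a real subword tokenizer."""
    BOS_ID = 1

    def encode(self, text, add_special_tokens=True):
        # Fake one token id per whitespace-separated word; ids start at 2
        # so they never collide with BOS. A real tokenizer uses a learned
        # subword vocabulary instead.
        ids = [len(word) + 2 for word in text.split()]
        return [self.BOS_ID] + ids if add_special_tokens else ids

tokenizer = StubTokenizer()
instructions = ["[INST] What is your favourite condiment? [/INST]",
                "[INST] Do you have mayonnaise recipes? [/INST]"]

# Only the first turn gets special tokens prepended (i == 0).
encodeds = [tokenizer.encode(instruction, add_special_tokens=i == 0)
            for i, instruction in enumerate(instructions)]

print(encodeds[0][0] == StubTokenizer.BOS_ID)  # True: first turn starts with BOS
print(encodeds[1][0] == StubTokenizer.BOS_ID)  # False: later turns do not
```

Encoding follow-up turns without a second BOS keeps the concatenated conversation looking like one continuous sequence to the model.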
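Of the architecture choices listed in the added section, sliding-window attention is the easiest to illustrate: each position attends only to itself and the previous `window - 1` positions rather than the full causal prefix. A minimal sketch of the attention mask (the window size of 4 is illustrative only; the production model uses a much larger window):

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True where position i may attend to position j.

    Combines causality (j <= i) with the sliding window (i - j < window).
    """
    return [[0 <= i - j < window for j in range(seq_len)]
            for i in range(seq_len)]

mask = sliding_window_mask(seq_len=6, window=4)

# Position 5 sees only positions 2..5; earlier tokens fall outside the window.
print([j for j in range(6) if mask[5][j]])  # [2, 3, 4, 5]
```

This caps per-token attention cost at the window size instead of the full sequence length, which is what makes long contexts cheaper than dense causal attention.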