Update README.md
README.md
CHANGED
@@ -26,7 +26,7 @@ The tokenizer of this model after adaptation is the same as [Minverva-3B](https:
 
 ## Data used for the adaptation
 
-The **Mistral-7B-v0.1-Adapted**
+The **Mistral-7B-v0.1-Adapted** models are trained on a collection of Italian and English data extracted from [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX).
 The data are extracted to be skewed toward Italian language with a ration of one over four. Extracting the first 9B tokens from Italian part of CulturaX and the first 3B tokens from English part of CulturaX.
 
 