Update README.md
README.md
CHANGED
@@ -31,12 +31,11 @@ VBART-XLarge improves the results compared to VBART-Large albeit in small margin
 
 ### Pre-training Data
 The base model is pre-trained on [vngrs-web-corpus](https://huggingface.co/datasets/vngrs-ai/vngrs-web-corpus). It is curated by cleaning and filtering Turkish parts of [OSCAR-2201](https://huggingface.co/datasets/oscar-corpus/OSCAR-2201) and [mC4](https://huggingface.co/datasets/mc4) datasets. These datasets consist of documents of unstructured web crawl data. More information about the dataset can be found on their respective pages. Data is filtered using a set of heuristics and certain rules, explained in the appendix of our [paper](https://arxiv.org/abs/2403.01308).
-#### Hardware
-- **GPUs**: 8 x Nvidia A100-80 GB
 #### Software
 - TensorFlow
 #### Pre-training Setting
-- **Duration**: Pre-trained for
+- **Duration**: Pre-trained for 8 days.
+- **GPUs**: 8 x Nvidia A100-80 GB
 - **Training tokens**: 84B
 - **Context Length**: 1024 for both encoder and decoder
 - **Training regime:** fp16 mixed precision
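For reviewers who want to sanity-check the updated Pre-training Setting figures, here is a rough back-of-the-envelope sketch. It assumes the 84B training tokens were all processed within the stated 8 days of wall-clock time on the 8 GPUs, which the README does not state explicitly. The function name `training_throughput` and the dictionary keys are illustrative, not part of the model's codebase.

```python
def training_throughput(tokens: float, days: float, gpus: int, context_length: int) -> dict:
    """Derive rough throughput figures from the README's pre-training settings.

    Assumption: `days` is continuous wall-clock time with all GPUs busy.
    """
    seconds = days * 24 * 60 * 60
    tokens_per_second = tokens / seconds
    return {
        "gpu_hours": gpus * days * 24,
        "tokens_per_second": tokens_per_second,
        "tokens_per_second_per_gpu": tokens_per_second / gpus,
        # Upper bound on full-length sequences if every sequence filled the context.
        "full_context_sequences": tokens / context_length,
    }


# Values taken from the README: 84B tokens, 8 days, 8 GPUs, context length 1024.
stats = training_throughput(tokens=84e9, days=8, gpus=8, context_length=1024)
for name, value in stats.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions the run comes to 1,536 GPU-hours and roughly 121.5K tokens per second in aggregate. Treat these as estimates only, since checkpointing, evaluation, and padding would all lower the effective rate.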