Commit 97066bf
1 Parent(s): 90a670b
Update README.md
README.md CHANGED
@@ -28,7 +28,7 @@ The training corpus has been tokenized using a byte version of [Byte-Pair Encodi
 used in the original [RoBERTA](https://github.com/pytorch/fairseq/tree/master/examples/roberta) model with a vocabulary size of 52,000 tokens.
 The RoBERTa-ca-v2 pretraining consists of a masked language model training that follows the approach employed for the RoBERTa base model
 with the same hyperparameters as in the original work.
-The training lasted a total of
+The training lasted a total of 96 hours with 16 NVIDIA V100 GPUs of 16GB DDRAM.

 ## Training corpora and preprocessing

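For context on the tokenization described in the edited README lines (a byte-level BPE vocabulary of 52,000 tokens, following the original RoBERTa setup), here is a minimal sketch using the Hugging Face `tokenizers` library. The corpus file path, `min_frequency` value, and output directory are illustrative assumptions and are not taken from this commit.

```python
# Minimal sketch: train a byte-level BPE tokenizer with a 52,000-token
# vocabulary, mirroring the RoBERTa-style tokenization the README describes.
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["catalan_corpus.txt"],  # hypothetical cleaned training corpus
    vocab_size=52_000,             # vocabulary size stated in the README
    min_frequency=2,               # assumed cutoff, not specified in the commit
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],  # RoBERTa special tokens
)
# Writes vocab.json and merges.txt for later use with a RoBERTa-style model.
tokenizer.save_model("roberta-ca-v2-tokenizer")
```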