fgaim/tiroberta-base


fgaim committed on Jul 6, 2021

Commit 56143a6 · 1 Parent(s): d4d974d

Update README

Files changed (1)

README.md +5 -2


@@ -1,15 +1,18 @@
 # RoBERTa Pretrained for Tigrinya Language
-We pretrain RoBERTa base on a relatively small dataset for Tigrinya (34M tokens).
+We pretrain a RoBERTa Base model on a relatively small dataset for Tigrinya (34M tokens) for 18 epochs.
+Contained in this card is a PyTorch model exported from the original model that was trained on TPU v3.8 with Flax.
 ## Hyperparameters
 The hyperparameters corresponding to model sizes mentioned above are as follows:
 | Model Size | L  | AH | HS  | FFN  | P    |
 |------------|----|----|-----|------|------|
 | BASE       | 12 | 12 | 768 | 3072 | 125M |
-(AH = number of attention heads; HS = hidden size; FFN = feedforward network dimension; P = number of parameters.)
+(L = number of layers; AH = number of attention heads; HS = hidden size; FFN = feedforward network dimension; P = number of parameters.)
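
As a rough sanity check on the table above, the parameter count P can be approximated from L, HS, and FFN. The sketch below is not from the README: it assumes the stock RoBERTa Base vocabulary size (50265) and position table size (514), which this model card does not state and which may differ for the Tigrinya tokenizer. The function name `roberta_encoder_params` is hypothetical. AH does not appear in the count because the attention heads only partition the hidden size.

```python
def roberta_encoder_params(layers, hidden, ffn, vocab_size, max_positions, type_vocab=1):
    """Count weights and biases of a RoBERTa-style encoder: embeddings, layers, pooler."""
    layer_norm = 2 * hidden  # one scale and one bias vector per LayerNorm
    # Token, position, and token-type embedding tables, followed by a LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections (weights + biases), then a LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two dense layers (hidden -> ffn -> hidden) with biases, then a LayerNorm.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden) + layer_norm
    # Dense pooler applied to the first token's hidden state.
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


# L, HS, and FFN come from the BASE row; vocab_size and max_positions are ASSUMED
# (stock RoBERTa Base defaults), not values taken from this model card.
total = roberta_encoder_params(
    layers=12, hidden=768, ffn=3072, vocab_size=50265, max_positions=514
)
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the total rounds to the 125M listed in the table. A different vocabulary size would shift the embedding term and therefore the total.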