# albert-base-v2-sq
---
license: mit
datasets:
  - uonlp/CulturaX
language:
  - sq
pipeline_tag: fill-mask
---

Albanian ALBERT model pretrained on roughly 16 GB of text (the `sq` configuration of uonlp/CulturaX) for 1.1 million training steps, using only the masked language modelling objective. Training ran on a TPU v4-32 pod, made possible through the Google TPU Research Cloud.
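Masked language modelling corrupts a fraction of the input tokens and trains the model to recover them. A minimal sketch of BERT/ALBERT-style dynamic masking (15% of tokens selected; of those, 80% replaced by the `[MASK]` id, 10% by a random token, 10% left unchanged). The token ids, vocabulary size, and `[MASK]` id below are illustrative placeholders, not this model's actual values:

```python
import random

MASK_ID = 4          # hypothetical [MASK] token id
VOCAB_SIZE = 30000   # hypothetical vocabulary size
IGNORE_INDEX = -100  # label for unmasked positions (ignored by the loss)

def mask_tokens(token_ids, mask_prob=0.15, rng=None):
    """Return (corrupted_ids, labels) for one sequence, BERT-style."""
    rng = rng or random.Random()
    inputs = list(token_ids)
    labels = [IGNORE_INDEX] * len(inputs)
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok          # predict the original token here
            roll = rng.random()
            if roll < 0.8:
                inputs[i] = MASK_ID  # 80%: replace with [MASK]
            elif roll < 0.9:
                inputs[i] = rng.randrange(VOCAB_SIZE)  # 10%: random token
            # remaining 10%: keep the original token
    return inputs, labels

corrupted, labels = mask_tokens(range(100), rng=random.Random(0))
```

Masking is applied per batch at training time, so the same sentence can be corrupted differently across epochs.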

## Hyperparameters

- Optimizer: LAMB
- Learning rate: 0.0006
- $\beta_1$: 0.9
- $\beta_2$: 0.999
- $\epsilon$: 1e-8
- Batch size: 1024
- Number of steps: 1.1 million
- dtype: bfloat16
- Max. sequence length: 512
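LAMB combines Adam-style adaptive moments with a layerwise trust ratio, which is what lets large-batch pretraining (batch size 1024 here) keep a stable learning rate. A simplified pure-Python sketch of one LAMB update using the hyperparameters above; real implementations operate on per-layer tensors and often clip the trust ratio:

```python
import math

def lamb_step(w, g, m, v, t, lr=6e-4, b1=0.9, b2=0.999, eps=1e-8, wd=0.0):
    """One LAMB update for a single weight vector (lists of floats)."""
    # Adam-style first and second moment estimates with bias correction
    m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]
    v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, g)]
    m_hat = [mi / (1 - b1 ** t) for mi in m]
    v_hat = [vi / (1 - b2 ** t) for vi in v]
    # update direction, optionally with decoupled weight decay
    r = [mh / (math.sqrt(vh) + eps) + wd * wi
         for mh, vh, wi in zip(m_hat, v_hat, w)]
    # layerwise trust ratio: ||w|| / ||r|| scales the step per layer
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    r_norm = math.sqrt(sum(ri * ri for ri in r))
    trust = w_norm / r_norm if w_norm > 0 and r_norm > 0 else 1.0
    w = [wi - lr * trust * ri for wi, ri in zip(w, r)]
    return w, m, v

# one step on a toy 2-parameter "layer"
w, m, v = lamb_step([1.0, 2.0], [0.1, -0.2], [0.0, 0.0], [0.0, 0.0], t=1)
```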

Performance on further Albanian downstream tasks will be posted as evaluation completes.

## Classification Tasks

| Task | Learning rate | Epochs | Accuracy | Precision | Recall | F1 score |
|------|---------------|--------|----------|-----------|--------|----------|
| AlbMoRe [1] | 1e-05 | 10 | 0.98 | 0.97 | 0.99 | 0.98 |
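The reported F1 score is the harmonic mean of precision and recall; a quick consistency check of the row above:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.97, 0.99)  # precision and recall from the table above
```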

## Regression Tasks

TODO

## References

[1] Çano, E. (2023). AlbMoRe: A corpus of movie reviews for sentiment analysis in Albanian. arXiv preprint arXiv:2306.08526.