bert-tiny-ita is an Italian foundational model (based on bert-tiny) pretrained from scratch on 20k Italian Wikipedia articles and on a wide collection of Italian words and dictionary definitions. It uses a context window of 512 tokens.

The project is still a work in progress; new versions will be released over time.

Use it as a foundational model to be fine-tuned for specific Italian tasks.
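
Before fine-tuning, it can help to check that the checkpoint loads and predicts masked tokens. The following is a minimal sketch, not the author's own usage code. It assumes the Hub repository id is mascIT/bert-tiny-ita, that the checkpoint ships a tokenizer and a masked-LM head, and that the transformers library with a PyTorch backend is installed. The helper name top_predictions and the {mask} placeholder are hypothetical, introduced only for this illustration.

```python
MODEL_ID = "mascIT/bert-tiny-ita"  # assumed Hub repository id


def top_predictions(template, k=5):
    """Fill the {mask} slot in `template` and return (token, score) pairs.

    Hypothetical helper: assumes the checkpoint exposes a masked-LM head.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID)
    # Use the tokenizer's own mask token instead of hard-coding "[MASK]".
    sentence = template.format(mask=fill.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill(sentence, top_k=k)]
```

Calling top_predictions("Roma è la {mask} d'Italia.") downloads the weights on first use. The predictions depend on the checkpoint, so no expected output is shown here.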

Training

epochs: 250
lr: 1e-5
optim: AdamW
weight_decay: 1e-4
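
As a rough illustration of how these settings map onto code, the sketch below builds an AdamW optimizer with the listed learning rate and weight decay. It assumes PyTorch; the card does not say which framework, scheduler, or batch size was used, so those are left out. The names TRAINING_CONFIG and build_optimizer are hypothetical.

```python
# Hyperparameters exactly as listed on the card.
TRAINING_CONFIG = {
    "epochs": 250,
    "lr": 1e-5,
    "weight_decay": 1e-4,
}


def build_optimizer(model):
    """Create an AdamW optimizer over `model`'s parameters (assumes PyTorch)."""
    # Imported lazily so the config can be inspected without PyTorch installed.
    import torch

    return torch.optim.AdamW(
        model.parameters(),
        lr=TRAINING_CONFIG["lr"],
        weight_decay=TRAINING_CONFIG["weight_decay"],
    )
```

Here model is any torch.nn.Module, for example a masked-LM model that is about to be pretrained or fine-tuned.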

Eval

perplexity: 45 (it's a 12MB model!)
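
The card does not state how the perplexity was computed. Assuming the usual definition, the exponential of the mean cross-entropy (in nats) over the evaluated tokens, the sketch below shows the relationship between loss and perplexity. The per-token losses are made up for illustration and are not measurements from this model.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood), with losses in nats."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical per-token losses, chosen only for illustration.
example_nlls = [3.2, 4.1, 3.9, 4.0]
print(f"perplexity of the example losses: {perplexity(example_nlls):.2f}")

# Mean loss (in nats) that corresponds to a perplexity of 45.
print(f"mean loss for perplexity 45: {math.log(45):.3f}")
```

Under this definition, a perplexity of 45 corresponds to a mean loss of about 3.81 nats per token.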

Safetensors

model size: 3.06M params
tensor type: F32
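
The parameter count and tensor type are consistent with the 12MB figure quoted under Eval: each F32 value takes 4 bytes. A quick check, treating 3.06M as the rounded count shown on the card (the exact count is not given):

```python
PARAMS = 3_060_000   # "3.06M params" from the card (rounded)
BYTES_PER_PARAM = 4  # F32 stores each value in 4 bytes


def weights_size_bytes(n_params, bytes_per_param=BYTES_PER_PARAM):
    """Approximate size of the raw weights, ignoring file metadata."""
    return n_params * bytes_per_param


size = weights_size_bytes(PARAMS)
print(f"{size / 1e6:.2f} MB")     # decimal megabytes
print(f"{size / 2**20:.2f} MiB")  # binary mebibytes
```

This gives roughly 12.24 MB of weights, which matches the size quoted for the model.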

Model tree for mascIT/bert-tiny-ita

finetunes: 1 model