# Sindhi-BERT-base

The first BERT-style model trained from scratch on Sindhi text.

## Training History
| Session | Data | Epochs | Eval PPL | Notes |
|---|---|---|---|---|
| S1 | 500K lines | 5 | 78.10 | from scratch |
| S2 | 1.5M lines | 3 | 41.62 | continued |
| S3 | 87M words | 2 | 28.46 | bf16, cosine LR |
| S4 | 87M words | 3 | 35.16 | grouped context, MLM=0.20 |
| S5 | 87M words | 2 | 29.67 | fine polish, MLM=0.15 |
| S6r | 149M words | 2 | 31.66 | grouping=80, LR=5e-6 |
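The PPL column follows directly from the evaluation loss: perplexity is the exponential of the mean masked-token cross-entropy (in nats per token). A minimal sketch of the conversion, with the helper name chosen here for illustration:

```python
import math

def perplexity(mean_mlm_loss: float) -> float:
    """Perplexity is exp(mean cross-entropy loss per masked token)."""
    return math.exp(mean_mlm_loss)

# e.g. an eval loss of ~3.349 corresponds to PPL ~28.5 (session S3)
```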
## Usage
```python
from transformers import RobertaForMaskedLM
from huggingface_hub import hf_hub_download
import sentencepiece as spm
import torch
import torch.nn.functional as F

REPO = "hellosindh/sindhi-bert-base"
MASK_ID = 32000  # <mask> token ID in the 32K SentencePiece vocabulary
BOS_ID = 2
EOS_ID = 3

# Load the model weights from the Hub.
model = RobertaForMaskedLM.from_pretrained(REPO)

# The tokenizer is a standalone SentencePiece model shipped in the repo.
sp_path = hf_hub_download(REPO, "sindhi_bpe_32k.model")
sp = spm.SentencePieceProcessor()
sp.Load(sp_path)
```
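With the model and tokenizer loaded, a masked token can be predicted by wrapping the SentencePiece IDs with the BOS/EOS IDs above, replacing one position with `MASK_ID`, and taking the top logits at that position. The helper below is a hedged sketch (the function name and example flow are not part of the official card):

```python
import torch
import torch.nn.functional as F

def build_input(ids, mask_pos, bos_id=2, eos_id=3, mask_id=32000):
    """Wrap SentencePiece token IDs with BOS/EOS and mask one position."""
    ids = list(ids)
    ids[mask_pos] = mask_id
    return [bos_id] + ids + [eos_id]

# Assumed usage with the `sp` and `model` objects from the snippet above:
# ids = sp.EncodeAsIds("...")            # tokenize a Sindhi sentence
# input_ids = torch.tensor([build_input(ids, mask_pos=2)])
# with torch.no_grad():
#     logits = model(input_ids).logits
# mask_idx = (input_ids[0] == 32000).nonzero(as_tuple=True)[0]
# probs = F.softmax(logits[0, mask_idx], dim=-1)
# top5 = probs.topk(5)
# print([sp.IdToPiece(int(i)) for i in top5.indices[0]])
```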