sindhi-bert-base / tokenizer.json

Commit History

Revert: clean tokenizer matching SP model exactly
ee669ca
verified

hellosindh committed on

Fix: correct special token order and remove duplicate </s>
b3274ee
verified

hellosindh committed on

Revert: restore original vocab order
f67628e
verified

hellosindh committed on

Fix: swap mask token to correct index 32000 in Unigram vocab
00b38ab
verified

hellosindh committed on