Revert: clean tokenizer_config matching training IDs b500d27 verified hellosindh committed 26 days ago
Revert: clean tokenizer matching SP model exactly ee669ca verified hellosindh committed 26 days ago
Fix: correct special token order and remove duplicate </s> b3274ee verified hellosindh committed 26 days ago
Fix: swap mask token to correct index 32000 in Unigram vocab 00b38ab verified hellosindh committed 26 days ago
Add standard SP filename for XLMRoberta compatibility 1592495 verified hellosindh committed 26 days ago