janakhpon/mon_tokenizer

2.28 MB

1 contributor

History: 12 commits

feat: update tokenizer artifacts with 41.4M character corpus

39ad643 2 months ago

| File | Size | Last commit | Updated |
|---|---|---|---|
| .gitattributes | 315 Bytes | feat: restructure and upgrade to 32k vocab model (v2) | 2 months ago |
| .gitignore | 522 Bytes | feat: restructure and upgrade to 32k vocab model (v2) | 2 months ago |
| README.md | 2.23 kB | feat: update tokenizer artifacts with 41.4M character corpus | 2 months ago |
| added_tokens.json | 21 Bytes | feat: restructure and upgrade to 32k vocab model (v2) | 2 months ago |
| mon_tokenizer.vocab | 1 MB (xet) | feat: update tokenizer artifacts with 41.4M character corpus | 2 months ago |
| special_tokens_map.json | 552 Bytes | feat: simplified mon tokenizer in hf format, updated tags, resolve the legacy issue | 10 months ago |
| tokenizer.json | 278 kB | feat: simplified mon tokenizer in hf format, updated tags, resolve the legacy issue | 10 months ago |
| tokenizer.model | 996 kB (xet) | feat: update tokenizer artifacts with 41.4M character corpus | 2 months ago |
| tokenizer_config.json | 1.14 kB | feat: restructure and upgrade to 32k vocab model (v2) | 2 months ago |