flexitok/mod-tokenizers
A merged super-vocabulary built from 9 tokenizers.
Vocab size: 100007
Participating tokenizers:
- flexitok/mod-tokenizers-individual
- flexitok/mod-tokenizers-ltr_3digit
- flexitok/mod-tokenizers-ltr_2digit
- flexitok/mod-tokenizers-ltr_4digit
- flexitok/mod-tokenizers-ltr_5digit
- flexitok/mod-tokenizers-rtl_2digit
- flexitok/mod-tokenizers-rtl_3digit
- flexitok/mod-tokenizers-rtl_4digit
- flexitok/mod-tokenizers-rtl_5digit

Files:
- super_vocab.json: merged vocabulary mapping token string → super index
- config.yaml: model config with vocab_size
- participating_tokenizers.json: list of tokenizer names included
- <tokenizer>_super_mapping.json: per-tokenizer index → super index mapping
- <tokenizer>_vocab.json: per-tokenizer vocabulary
- <tokenizer>_info.json / .yaml: tokenizer metadata
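
The files listed above suggest a simple lookup path from a tokenizer's own token IDs to super indices. The sketch below is a minimal illustration, not the repository's own loader. It assumes that super_vocab.json and <tokenizer>_vocab.json are flat JSON objects of token string → integer index, and that <tokenizer>_super_mapping.json is a JSON object keyed by the per-tokenizer index (stored as a string, as JSON requires). The tokenizer name `ltr_2digit`, the toy vocabularies, and the helper names `load_json` and `to_super_ids` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_json(path):
    """Read a UTF-8 JSON file and return the parsed object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def to_super_ids(local_ids, super_mapping):
    """Translate per-tokenizer IDs to super indices.

    Assumes the mapping is keyed by the local index as a string,
    since JSON object keys are always strings.
    """
    return [super_mapping[str(i)] for i in local_ids]


# Toy stand-ins for the real files (contents are invented for illustration).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    toy_files = {
        "super_vocab.json": {"1": 0, "12": 1, "123": 2},
        "ltr_2digit_vocab.json": {"12": 0, "1": 1},
        "ltr_2digit_super_mapping.json": {"0": 1, "1": 0},
    }
    for name, content in toy_files.items():
        (root / name).write_text(json.dumps(content), encoding="utf-8")

    super_vocab = load_json(root / "super_vocab.json")
    local_vocab = load_json(root / "ltr_2digit_vocab.json")
    super_mapping = load_json(root / "ltr_2digit_super_mapping.json")

# Map a sequence of local token IDs into the shared index space.
super_ids = to_super_ids([0, 1], super_mapping)

# Sanity check: each local token should land on the super index
# that super_vocab assigns to the same token string.
consistent = all(
    super_vocab[token] == super_mapping[str(local_id)]
    for token, local_id in local_vocab.items()
)
print(super_ids, consistent)
```

With the real files, the same consistency check is a cheap way to confirm that the assumed layout matches what is actually stored before relying on the mapping.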