Bantam Tokenizer

BPE tokenizer with 128,016 vocabulary entries (128,000 base + 16 special tokens).

Special tokens

Token      ID      String
unk_token  128000  <|UNK|>
pad_token  128001  <|PAD|>
bos_token  128002  <|BOS|>
eos_token  128003  <|EOS|>
-          128004  <|SYSTEM_START|>
-          128005  <|SYSTEM_END|>
-          128006  <|USER_START|>
-          128007  <|USER_END|>
-          128008  <|AGENT_START|>
-          128009  <|AGENT_END|>
-          128010  <|THINK_START|>
-          128011  <|THINK_END|>
-          128012  <|COMPUTE_START|>
-          128013  <|COMPUTE_END|>
-          128014  <|IMAGE_START|>
-          128015  <|IMAGE_END|>

Config quick-reference

pad_token_id: 128001
bos_token_id: 128002
eos_token_id: 128003
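
These three IDs are what batching code usually needs. The following is a minimal sketch in plain Python, not part of the tokenizer itself: the helper name `pad_batch` is hypothetical, and since this card does not say whether the tokenizer adds BOS/EOS on its own, the wrapping step is an assumption that can be switched off.

```python
# IDs copied from the quick-reference above.
PAD_ID, BOS_ID, EOS_ID = 128001, 128002, 128003

def pad_batch(sequences, add_bos_eos=True):
    """Right-pad lists of token IDs to a common length.

    Returns (input_ids, attention_mask). Wrapping with BOS/EOS is an
    assumption; pass add_bos_eos=False if the tokenizer already does it.
    """
    if not sequences:
        return [], []
    if add_bos_eos:
        sequences = [[BOS_ID, *seq, EOS_ID] for seq in sequences]
    width = max(len(seq) for seq in sequences)
    # Pad positions get PAD_ID in input_ids and 0 in the attention mask.
    input_ids = [seq + [PAD_ID] * (width - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (width - len(seq)) for seq in sequences]
    return input_ids, attention_mask
```

Calling `pad_batch([[11, 22, 33], [44]])` (the small IDs are arbitrary placeholders) yields two rows of length 5, with the second row ending in two pad positions that are masked out.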

Usage

from transformers import AutoTokenizer
tok = AutoTokenizer.from_pretrained("path/to/bantam-tokenizer")
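
After loading, it can be worth confirming that the tokenizer files agree with the special tokens table. The sketch below assumes only that `tok` exposes `convert_tokens_to_ids`, as `transformers` tokenizers do; the names `EXPECTED_SPECIAL_TOKENS` and `find_mismatches` are hypothetical helpers, not something shipped with this repo.

```python
# Expected string -> ID mapping, copied from the special tokens table.
EXPECTED_SPECIAL_TOKENS = {
    "<|UNK|>": 128000,
    "<|PAD|>": 128001,
    "<|BOS|>": 128002,
    "<|EOS|>": 128003,
    "<|SYSTEM_START|>": 128004,
    "<|SYSTEM_END|>": 128005,
    "<|USER_START|>": 128006,
    "<|USER_END|>": 128007,
    "<|AGENT_START|>": 128008,
    "<|AGENT_END|>": 128009,
    "<|THINK_START|>": 128010,
    "<|THINK_END|>": 128011,
    "<|COMPUTE_START|>": 128012,
    "<|COMPUTE_END|>": 128013,
    "<|IMAGE_START|>": 128014,
    "<|IMAGE_END|>": 128015,
}

def find_mismatches(tok):
    """Return {token: (expected_id, actual_id)} for entries that disagree."""
    mismatches = {}
    for token, expected in EXPECTED_SPECIAL_TOKENS.items():
        actual = tok.convert_tokens_to_ids(token)
        if actual != expected:
            mismatches[token] = (expected, actual)
    return mismatches
```

With `tok` loaded as in the snippet above, `find_mismatches(tok)` should return an empty dict if the table matches the shipped files, and `len(tok)` can be compared against 128,016 in the same spirit.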