BPE-tokenizer-32k / tokenizer_config.json
{
  "clean_up_tokenization_spaces": true,
  "model_max_length": 2048,
  "special_tokens": [
    "<|pad|>",
    "<|endoftext|>",
    "<|beginoftext|>",
    "<|unk|>"
  ],
  "tokenizer_class": "PreTrainedTokenizerFast"
}
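As a minimal sketch of how this config might be consumed, the snippet below parses the JSON shown above with the standard library and reads out the maximum sequence length and the special tokens. The inlined string is a verbatim copy of the file; no Hugging Face libraries are assumed.

```python
import json

# Verbatim contents of tokenizer_config.json as shown above.
config_text = """{
  "clean_up_tokenization_spaces": true,
  "model_max_length": 2048,
  "special_tokens": [
    "<|pad|>",
    "<|endoftext|>",
    "<|beginoftext|>",
    "<|unk|>"
  ],
  "tokenizer_class": "PreTrainedTokenizerFast"
}"""

config = json.loads(config_text)

# Inputs longer than this should be truncated or chunked by the caller.
print(config["model_max_length"])   # 2048

# The declared special tokens, in order.
print(config["special_tokens"])     # ['<|pad|>', '<|endoftext|>', '<|beginoftext|>', '<|unk|>']
```

In practice a tokenizer of class `PreTrainedTokenizerFast` would typically be loaded from the containing repo with `transformers.AutoTokenizer.from_pretrained(...)`, which reads this file alongside the tokenizer's vocabulary files; the parsing above is only for illustrating the fields.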