turk-tokenizer / tokenizer_config.json
nmstech's picture
Initial release: TurkTokenizer v1.0.0 — TR-MMLU 92%
ca41c16 verified
{
"tokenizer_class": "TurkTokenizer",
"model_type": "turk-tokenizer",
"version": "1.0.0",
"description": "Turkish morphological tokenizer — TR-MMLU world record 92%",
"language": "tr",
"authors": "Ethosoft",
"requires_java": true,
"dependencies": ["turkish-tokenizer", "jpype1"]
}