Ethosoft / NedoTurkishTokenizer

NedoTurkishTokenizer
32.4 MB
  • 6 contributors
History: 17 commits
nmstech
Update README.md
92ffed4 verified about 1 month ago
  • .claude
    Fix broken placeholder mechanism: replace with segment-based tokenization about 1 month ago
  • turk_tokenizer
    Add smart ACRONYM detection: TDK-based disambiguation for uppercase tokens about 1 month ago
  • .gitattributes
    1.59 kB
    Initial release: TurkTokenizer v1.0.0 - TR-MMLU 92% about 1 month ago
  • .gitignore
    75 Bytes
    Fix broken placeholder mechanism: replace with segment-based tokenization about 1 month ago
  • README.md
    6.89 kB
    Update README.md about 1 month ago
  • pyproject.toml
    1.13 kB
    Fix build backend: setuptools.backends.legacy → setuptools.build_meta about 1 month ago
  • tokenization_turk.py
    6.4 kB
    Add AutoTokenizer support (trust_remote_code) about 1 month ago
  • tokenizer_config.json
    358 Bytes
    Add AutoTokenizer support (trust_remote_code) about 1 month ago