Hugging Face's logo Hugging Face
  • Models
  • Datasets
  • Spaces
  • Buckets new
  • Docs
  • Enterprise
  • Pricing

  • Log In
  • Sign Up

Ethosoft
/
NedoTurkishTokenizer

Token Classification
Transformers
Turkish
nedo-turkish-tokenizer
tokenizer
morphology
turkish
nlp
Model card Files Files and versions
xet
Community
NedoTurkishTokenizer
32.4 MB
Ctrl+K
Ctrl+K
  • 2 contributors
History: 16 commits
nmstech's picture
nmstech
Add smart ACRONYM detection: TDK-based disambiguation for uppercase tokens
fcd513a verified 22 days ago
  • .claude
    Fix broken placeholder mechanism: replace with segment-based tokenization 22 days ago
  • turk_tokenizer
    Add smart ACRONYM detection: TDK-based disambiguation for uppercase tokens 22 days ago
  • .gitattributes
    1.59 kB
    Initial release: TurkTokenizer v1.0.0 โ€” TR-MMLU 92% 22 days ago
  • .gitignore
    75 Bytes
    Fix broken placeholder mechanism: replace with segment-based tokenization 22 days ago
  • README.md
    6.89 kB
    Update model card with Use This Model section 22 days ago
  • pyproject.toml
    1.13 kB
    Fix build backend: setuptools.backends.legacy โ†’ setuptools.build_meta 22 days ago
  • tokenization_turk.py
    6.4 kB
    Add AutoTokenizer support (trust_remote_code) 22 days ago
  • tokenizer_config.json
    358 Bytes
    Add AutoTokenizer support (trust_remote_code) 22 days ago