Ethosoft/NedoTurkishTokenizer
Tags: Token Classification, Transformers, Turkish, nedo-turkish-tokenizer, tokenizer, morphology, turkish, nlp
License: mit
fcd513a
NedoTurkishTokenizer
32.4 MB
2 contributors
History: 16 commits
Latest commit: nmstech, "Add smart ACRONYM detection: TDK-based disambiguation for uppercase tokens" (fcd513a, verified, 22 days ago)
Name                   Scan  Size       Last commit                                                                 Updated
.claude                -     -          Fix broken placeholder mechanism: replace with segment-based tokenization   22 days ago
turk_tokenizer         -     -          Add smart ACRONYM detection: TDK-based disambiguation for uppercase tokens  22 days ago
.gitattributes         Safe  1.59 kB    Initial release: TurkTokenizer v1.0.0 - TR-MMLU 92%                         22 days ago
.gitignore             Safe  75 Bytes   Fix broken placeholder mechanism: replace with segment-based tokenization   22 days ago
README.md              Safe  6.89 kB    Update model card with Use This Model section                               22 days ago
pyproject.toml         Safe  1.13 kB    Fix build backend: setuptools.backends.legacy -> setuptools.build_meta      22 days ago
tokenization_turk.py   Safe  6.4 kB     Add AutoTokenizer support (trust_remote_code)                               22 days ago
tokenizer_config.json  Safe  358 Bytes  Add AutoTokenizer support (trust_remote_code)                               22 days ago