Hugging Face
Models
Datasets
Spaces
Buckets
new
Docs
Enterprise
Pricing
Log In
Sign Up
Ethosoft
/
NedoTurkishTokenizer
like
1
Follow
Ethosoft
4
Model card
Files
Files and versions
xet
Community
92ffed4
NedoTurkishTokenizer
32.4 MB
Ctrl+K
Ctrl+K
6 contributors
History:
17 commits
nmstech
Update README.md
92ffed4
verified
about 1 month ago
.claude
Fix broken placeholder mechanism: replace with segment-based tokenization
about 1 month ago
turk_tokenizer
Add smart ACRONYM detection: TDK-based disambiguation for uppercase tokens
about 1 month ago
.gitattributes
Safe
1.59 kB
Initial release: TurkTokenizer v1.0.0 โ TR-MMLU 92%
about 1 month ago
.gitignore
Safe
75 Bytes
Fix broken placeholder mechanism: replace with segment-based tokenization
about 1 month ago
README.md
6.89 kB
Update README.md
about 1 month ago
pyproject.toml
Safe
1.13 kB
Fix build backend: setuptools.backends.legacy โ setuptools.build_meta
about 1 month ago
tokenization_turk.py
Safe
6.4 kB
Add AutoTokenizer support (trust_remote_code)
about 1 month ago
tokenizer_config.json
Safe
358 Bytes
Add AutoTokenizer support (trust_remote_code)
about 1 month ago