Update README.md
--- a/README.md
+++ b/README.md
@@ -14,7 +14,7 @@ pipeline_tag: token-classification
 
 # NedoTurkishTokenizer
 
-**Turkish morphological tokenizer — TR-MMLU world record
+**Turkish morphological tokenizer — TR-MMLU world record 92.64%**
 
 NedoTurkishTokenizer performs linguistically-aware tokenization of Turkish text using morphological rules. Unlike BPE-based tokenizers, it produces meaningful morphological units (roots and suffixes) aligned with Turkish grammar, powered by [Zemberek NLP](https://github.com/ahmetaa/zemberek-nlp).
 
@@ -25,7 +25,7 @@ NedoTurkishTokenizer performs linguistically-aware tokenization of Turkish text
 | **Developer** | [Ethosoft](https://huggingface.co/Ethosoft) |
 | **Language** | Turkish (`tr`) |
 | **License** | MIT |
-| **Benchmark** | TR-MMLU **
+| **Benchmark** | TR-MMLU **92.64%** (world record) |
 | **Morphological engine** | Zemberek NLP (bundled) |
 
 ---