Update README.md
README.md
@@ -173,6 +173,10 @@ Faust-1 uses a custom tokenizer optimized for German morphology and compounding.
 
 Lower token counts on German text translate directly into more usable context, lower inference cost, and less fragmentation on compound-heavy inputs.
 
+
+<img src="tokenizer_faust.png" alt="Faust-1 vs OpenAI Tokenizers" width="800">
+
+
 ---
 
 ## German benchmark performance
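The "fragmentation on compound-heavy inputs" point in the changed paragraph can be illustrated with a toy sketch. This is not the Faust-1 tokenizer or any OpenAI tokenizer (neither vocabulary is given in this diff); it is a minimal greedy longest-match segmenter with two invented vocabularies, one containing German morpheme merges and one without them:

```python
# Illustrative sketch only: a toy greedy longest-match segmenter with
# invented vocabularies, NOT the actual Faust-1 or OpenAI tokenizers.
def segment(text, vocab):
    """Greedily match the longest vocabulary entry at each position,
    falling back to single characters (worst-case fragmentation)."""
    tokens = []
    i = 0
    while i < len(text):
        match = next(
            (text[i:j] for j in range(len(text), i, -1) if text[i:j] in vocab),
            text[i],  # fall back to one character if nothing matches
        )
        tokens.append(match)
        i += len(match)
    return tokens

word = "Donaudampfschiff"  # German compound: "Danube steamship"

# A vocabulary with German morpheme-level entries keeps the compound in
# few pieces; a vocabulary without them shatters it into many fragments.
german_vocab = {"Donau", "dampf", "schiff"}
generic_vocab = {"Don", "au", "da", "mpf", "sch", "iff"}

print(segment(word, german_vocab))   # 3 tokens
print(segment(word, generic_vocab))  # 6 tokens
```

Fewer tokens per word means the same context window holds more German text, which is the efficiency claim the added chart (`tokenizer_faust.png`) is meant to visualize.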