Update README.md
README.md
CHANGED
@@ -9,7 +9,7 @@ pipeline_tag: text-generation
 **Emendator** is a [byt5-xl](https://huggingface.co/google/byt5-xl) model finetuned to correct OCR artifacts in Latin text.
 
 **This model cannot provide completely faithful reconstruction for all orthographies - on a large scale, it will shift the distribution of tokens towards that which it has been trained on.**
-This is to say: **Emendator will take editorial liberties with your data
+This is to say: **Emendator will take editorial liberties with your data.** It is fond of introducing abbreviations. As such, use it only in circumstances when the primary concern is only to recover intelligible Latin, not to recover intelligible Latin of a *particular* style.
 
 
 The model is intended to be used on segments of **250** characters. Anything else will compromise performance.
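
The README asks for input segments of 250 characters but does not say how to produce them. The following is a minimal sketch of one possible approach, assuming plain fixed-width slicing is acceptable. The names `SEGMENT_CHARS` and `split_into_segments` are hypothetical and not part of the model's repository. The final segment will be shorter than 250 characters whenever the input length is not a multiple of 250, and the README does not say how such a remainder should be handled.

```python
from typing import List

# Segment length stated in the README; the constant name is an assumption.
SEGMENT_CHARS = 250


def split_into_segments(text: str, size: int = SEGMENT_CHARS) -> List[str]:
    """Slice text into consecutive segments of at most `size` characters.

    Hypothetical helper: no overlap, no attention to word boundaries.
    Concatenating the returned segments reproduces the input exactly.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]
```

Each returned segment could then be passed to the model one at a time and the corrected outputs joined in order. Whether cutting in the middle of a word affects correction quality is not addressed in the README, so a whitespace-aware splitter may be worth trying.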