Update README.md
README.md
CHANGED
@@ -8,13 +8,12 @@ pipeline_tag: text-generation
 
 **Emendator** is a [byt5-xl](https://huggingface.co/google/byt5-xl) model finetuned to correct OCR artifacts in Latin text.
 
-**This model cannot provide completely faithful reconstruction for all orthographies - on a large scale, it will shift the distribution of tokens towards
+**This model cannot provide completely faithful reconstruction for all orthographies - on a large scale, it will shift the distribution of tokens towards that which it has been trained on.**
 This is to say that **Emendator will take editorial liberties with your data.** It is fond of introducing abbreviations. As such, use it only in circumstances when the primary concern is only to recover intelligible Latin, not to recover intelligible Latin of a *particular* style.
 
 
 The model is intended to be used on segments of **250** characters. Anything else will compromise performance.
 
-
 ### Lightly Corrupted Text
 Original: "atque optimo viro, peterem; superavi tamen dignitate Catilinam, gratia Galbam. Quod si id crimen homini novo esse deberet, profecto",
 
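
The README in this diff says the model expects segments of 250 characters. A minimal sketch of one way to prepare input follows. The helper name `split_into_segments` is hypothetical, and cutting at fixed character offsets (with no regard for word boundaries) is an assumption, since the README does not say how segments should be formed or how a shorter final segment should be handled.

```python
def split_into_segments(text: str, size: int = 250) -> list[str]:
    """Split text into consecutive segments of at most `size` characters.

    Assumption: segments are cut at fixed character offsets; the README
    does not specify whether word boundaries should be respected.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the text in strides of `size`; the last slice may be shorter.
    return [text[i:i + size] for i in range(0, len(text), size)]


if __name__ == "__main__":
    sample = "a" * 600
    segments = split_into_segments(sample)
    print([len(s) for s in segments])  # [250, 250, 100]
```

The final segment may be shorter than 250 characters. Given the README's warning that other lengths compromise performance, it may be worth checking the output for that last segment more carefully.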