Update README.md
README.md
CHANGED
@@ -6,11 +6,11 @@ language:
 - de
 ---

-**OCRonos** is a series of specialized language model for OCR correction
+**OCRonos** is a series of specialized language models for OCR correction.

-OCRonos models are trained on a highly diverse set of ocrized texts in multiple languages from PleIAs open pre-training corpus, drawn from cultural heritage sources (Common Corpus) and financial and administrative documents in open data (Finance Commons).
+OCRonos models are trained by PleIAs on a highly diverse set of ocrized texts in multiple languages from PleIAs open pre-training corpus, drawn from cultural heritage sources (Common Corpus) and financial and administrative documents in open data (Finance Commons).

-This release
+This release currently features a model based on llama-3-8b that has been the most tested to date. Future releases will focus on smaller internal models that provide a better ratio of generation cost to quality.

 OCRonos is generally faithful to the original material, provides sensible restitution of deteriorated text, and will rarely rewrite correct words.
