Spaces: impresso-project/ocrqa-demo
simon-clmtd committed on Dec 5, 2025

Commit fb6f7ac (verified)

1 Parent(s): 026ec3a

simplify

Files changed (1)

app.py

@@ -98,14 +98,13 @@ with gr.Blocks(title="OCR QA Demo") as demo:
         """
     # 🔍 Optical Character Recognition (OCR) Quality Assessment Demo
-    The demo showcases how the [Impresso Project](https://impresso-project.ch) assesses the quality of ORC transcripts by estimating the proportion of (un)known words with respect to a large clean text corpus.
-    It returns:
-    - a list of **potential OCR errors** (unrecognized unique tokens) as well as the known unique tokens, and
-    - a **quality score** between 0.0 (poor) and 1.0 (excellent) computed as `score = known/(known + unknown)`
-    Try the example below (a German text with typical OCR errors), or paste your own OCR-processed text to assess its quality.
+    This demo evaluates OCR quality by comparing the unique words in a text against large reference vocabularies.
+    It reports:
+    - **potential OCR errors** (unrecognized unique tokens) and known tokens
+    - an overall **quality score** between 0.0 (poor) and 1.0 (perfect), defined as `score = known/(known + unknown)`
+    Try the German example below or paste your own OCR text.
     """
     )
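
For reference, the score described in the new text, `score = known/(known + unknown)` over unique tokens, can be sketched as follows. This is only an illustration. The function name `ocr_quality`, the regex tokenization, the plain `set` used as the reference vocabulary, and the 0.0 returned for empty input are all assumptions, since the demo's actual lookup code is not part of this diff.

```python
import re


def ocr_quality(text, vocabulary):
    """Return (score, known, unknown) for the unique tokens in `text`.

    Hypothetical helper: `vocabulary` is assumed to be a set of
    lowercase reference words.
    """
    # Assumed tokenization: lowercase runs of Unicode letters.
    tokens = set(re.findall(r"[^\W\d_]+", text.lower()))

    # Split the unique tokens into recognized and unrecognized ones.
    known = sorted(t for t in tokens if t in vocabulary)
    unknown = sorted(t for t in tokens if t not in vocabulary)

    # score = known / (known + unknown); 0.0 for empty input is a
    # convention chosen here, not something the diff specifies.
    total = len(known) + len(unknown)
    score = len(known) / total if total else 0.0
    return score, known, unknown


if __name__ == "__main__":
    vocab = {"die", "zeitung", "ist", "alt"}  # toy vocabulary
    score, known, unknown = ocr_quality("Die Zeitnng ist alt", vocab)
    print(score, known, unknown)
```

With the toy vocabulary above, the misspelled token `zeitnng` is the only unknown word, so three of the four unique tokens count as known.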