simon-clmtd committed on
Commit fb6f7ac · verified · 1 Parent(s): 026ec3a
Files changed (1)
  1. app.py +5 -6
app.py CHANGED
@@ -98,14 +98,13 @@ with gr.Blocks(title="OCR QA Demo") as demo:
  """
  # 🔍 Optical Character Recognition (OCR) Quality Assessment Demo
 
- The demo showcases how the [Impresso Project](https://impresso-project.ch) assesses the quality of ORC transcripts by estimating the proportion of (un)known words with respect to a large clean text corpus.
+ This demo evaluates OCR quality by comparing the unique words in a text against large reference vocabularies.
+ It reports:
+ - **potential OCR errors** (unrecognized unique tokens) and known tokens
+ - an overall **quality score** between 0.0 (poor) and 1.0 (perfect), defined as `score = known/(known + unknown)`
 
- It returns:
- - a list of **potential OCR errors** (unrecognized unique tokens) as well as the known unique tokens, and
- - a **quality score** between 0.0 (poor) and 1.0 (excellent) computed as `score = known/(known + unknown)`
 
-
- Try the example below (a German text with typical OCR errors), or paste your own OCR-processed text to assess its quality.
+ Try the German example below or paste your own OCR text.
  """
  )
 
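The scoring rule in the new description, `score = known/(known + unknown)` over unique tokens, can be illustrated with a minimal sketch. This is not the code in `app.py`: the function name `ocr_quality`, the regex tokenizer, and the toy vocabulary are all assumptions made for illustration, and a plain Python set stands in for the large reference vocabularies the demo mentions.

```python
import re


def ocr_quality(text: str, vocabulary: set[str]) -> dict:
    """Score OCR quality as known / (known + unknown) over unique tokens."""
    # Unique lowercase word tokens; this simple tokenizer is an assumption.
    tokens = set(re.findall(r"[^\W\d_]+", text.lower()))
    known = sorted(t for t in tokens if t in vocabulary)
    unknown = sorted(tokens - vocabulary)
    total = len(known) + len(unknown)
    # With no tokens there is nothing to score; returning 0.0 is an arbitrary choice.
    score = len(known) / total if total else 0.0
    return {"score": round(score, 2), "known": known, "unknown": unknown}


# Toy German vocabulary (hypothetical); "Zeitnng" mimics a typical OCR error.
vocab = {"die", "zeitung", "berichtet", "heute"}
result = ocr_quality("Die Zeitnng berichtet heute", vocab)
print(result)
```

Because the score counts unique tokens rather than occurrences, a misread word that repeats many times in a text lowers the score only once.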