	---
	title: Ocr Entropy
	emoji: 🏢
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 6.4.0
	python_version: "3.10"
	app_file: app.py
	pinned: false
	license: mit
	short_description: Calculating the probabilities and entropy of OCR output
	---

	# OCR Confidence Visualization

	Extract text from document images with token-level confidence visualization.

	## Features

	- Token Streaming: Watch text appear token-by-token as the model generates
	- Confidence Colors: Each token is colored based on model confidence:
	  - Blue (>99%): Very high confidence
	  - Dark Green (>95%): High confidence
	  - Light Green (>85%): Good confidence
	  - Amber (>70%): Moderate confidence
	  - Red (>50%): Low confidence
	  - Purple (<=50%): Very low confidence
	- Token Alternatives: Click any token to see top alternative predictions with probabilities
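
The color thresholds above can be read as a simple bucket lookup. The following is a minimal sketch of that mapping, not the code in `app.py`; the function name `confidence_color` and the returned color labels are hypothetical and chosen only for illustration.

```python
def confidence_color(prob: float) -> str:
    """Map a token probability in [0, 1] to the color bucket listed above.

    Hypothetical helper: the real app may use different names or hex codes.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {prob}")

    # (lower bound, color) pairs, checked from most to least confident.
    # Bounds are strict (">"), matching the README's ">99%", ">95%", ...
    thresholds = [
        (0.99, "blue"),
        (0.95, "dark green"),
        (0.85, "light green"),
        (0.70, "amber"),
        (0.50, "red"),
    ]
    for lower_bound, color in thresholds:
        if prob > lower_bound:
            return color
    # Anything at or below 50% falls into the last bucket.
    return "purple"
```

Because the bounds are strict, a token at exactly 99% lands in the Dark Green bucket rather than Blue, and a token at exactly 50% is Purple.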

	## Model

	Uses [Nanonets-OCR2-3B](https://huggingface.co/nanonets/Nanonets-OCR2-3B), a Qwen2.5-VL-3B fine-tune optimized for document OCR.

	## Usage

	1. Upload a document image (JPG, PNG, etc.)
	2. Click "Transcribe"
	3. Watch tokens stream with confidence coloring
	4. Click any token to see alternative predictions

	## Technical Details

	- Extracts logprobs for each generated token
	- Converts logprobs to probabilities by exponentiation (a logprob is the log of a softmax output, so `p = exp(logprob)`)
	- Top-k alternatives stored for each token (k=20)
	- Compatible with ZeroGPU for deployment on Hugging Face Spaces
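
The steps above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the Space's implementation: the function names `log_softmax`, `top_k_alternatives`, and `entropy_bits` are hypothetical, and computing entropy over the renormalized top-k probabilities is an assumption about how the entropy named in the short description might be derived.

```python
import math


def log_softmax(logits: list[float]) -> list[float]:
    """Turn raw logits into log-probabilities (numerically stable)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def top_k_alternatives(logprobs: list[float], k: int = 20) -> list[tuple[int, float]]:
    """Return (token_id, probability) for the k most likely tokens.

    k=20 matches the value stated in the README.
    """
    ranked = sorted(enumerate(logprobs), key=lambda pair: pair[1], reverse=True)[:k]
    # exp() converts each logprob back to a probability.
    return [(token_id, math.exp(lp)) for token_id, lp in ranked]


def entropy_bits(probs: list[float]) -> float:
    """Shannon entropy in bits of the given probabilities.

    Assumption: the probabilities are renormalized to sum to 1 first,
    since a top-k slice usually does not cover the full vocabulary.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0)
```

For example, `entropy_bits([p for _, p in top_k_alternatives(log_softmax(logits))])` gives a per-token uncertainty score: values near 0 mean one candidate dominates, while larger values mean the probability mass is spread over several alternatives.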