---
title: Ocr Entropy
emoji: 🏢
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.4.0
python_version: "3.10"
app_file: app.py
pinned: false
license: mit
short_description: Calculating the probabilities and entropy of OCR output
---

# OCR Confidence Visualization

Extract text from document images with token-level confidence visualization.

## Features

- **Token Streaming**: Watch text appear token-by-token as the model generates
- **Confidence Colors**: Each token is colored based on model confidence:
  - Blue (>99%): Very high confidence
  - Dark Green (>95%): High confidence
  - Light Green (>85%): Good confidence
  - Amber (>70%): Moderate confidence
  - Red (>50%): Low confidence
  - Purple (<=50%): Very low confidence
- **Token Alternatives**: Click any token to see top alternative predictions with probabilities
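
The color bands listed under Confidence Colors amount to a threshold lookup. The following is a minimal sketch of that mapping, not the app's actual code: the names `CONFIDENCE_BANDS` and `confidence_color` are hypothetical, and the color labels are plain strings rather than the real CSS values.

```python
# Hypothetical helper: thresholds are copied from the feature list,
# the names and string labels are illustrative assumptions.
CONFIDENCE_BANDS = [
    (0.99, "blue"),         # very high confidence
    (0.95, "dark green"),   # high confidence
    (0.85, "light green"),  # good confidence
    (0.70, "amber"),        # moderate confidence
    (0.50, "red"),          # low confidence
]


def confidence_color(prob: float) -> str:
    """Return the color band for a token probability in [0, 1]."""
    for threshold, color in CONFIDENCE_BANDS:
        # Strict '>' mirrors the list: a probability of exactly 0.50 is purple.
        if prob > threshold:
            return color
    return "purple"  # very low confidence (<= 50%)
```

Because the bands are checked from highest to lowest, a token lands in the first band whose threshold it strictly exceeds; for example, a probability of exactly 0.99 falls into the dark green band, not the blue one.
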

## Model

Uses [Nanonets-OCR2-3B](https://huggingface.co/nanonets/Nanonets-OCR2-3B), a Qwen2.5-VL-3B fine-tune optimized for document OCR.

## Usage

1. Upload a document image (JPG, PNG, etc.)
2. Click "Transcribe"
3. Watch tokens stream with confidence coloring
4. Click any token to see alternative predictions

## Technical Details

- Extracts the log-probability (logprob) of each generated token
- Converts logprobs to probabilities by exponentiation, `p = exp(logprob)` (the softmax itself is applied to the raw logits, and logprobs are its logarithm)
- Stores the top-k alternatives for each token (k=20)
- ZeroGPU compatible for Hugging Face Spaces deployment
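
The logprob-to-probability step, and the entropy mentioned in the Space description, can be sketched as follows. This is an illustration under stated assumptions, not the app's implementation: `logprobs_to_probs` and `topk_entropy` are hypothetical names, the input is assumed to be a plain list of natural-log probabilities for one token position, and the entropy is computed over the renormalized top-k alternatives only, so it approximates the entropy of the full vocabulary distribution.

```python
import math


def logprobs_to_probs(logprobs):
    """Convert natural-log probabilities to probabilities."""
    return [math.exp(lp) for lp in logprobs]


def topk_entropy(logprobs):
    """Shannon entropy (in bits) of the renormalized top-k distribution.

    Assumption: only the top-k logprobs are available, so the
    probabilities are rescaled to sum to 1 before computing entropy.
    """
    probs = logprobs_to_probs(logprobs)
    total = sum(probs)
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:  # skip zero-probability entries; q * log2(q) tends to 0
            entropy -= q * math.log2(q)
    return entropy
```

A token with two equally likely alternatives has an entropy of about 1 bit, while a token with a single dominant candidate has an entropy close to 0, which is why entropy complements the single-token confidence value.
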