Spaces: sammoftah/document-understanding-ocr

	---
	title: Document Understanding OCR
	emoji: 📄
	colorFrom: yellow
	colorTo: blue
	sdk: docker
	pinned: false
	license: mit
	---

	# Document Understanding OCR

	## Question

	After OCR has produced text, how do we recover a structured document schema?

	## System Boundary

	This Streamlit Space demonstrates the post-OCR layer for invoices: field extraction, confidence scoring, line-item parsing, validation, and JSON export.

	## Method

	The app applies transparent extraction patterns to OCR text, computes a confidence score for each field, parses line items, and compares each field's confidence against a review threshold.
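A minimal sketch of pattern-based extraction with field-level confidence follows. The field names, regular expressions, and confidence values (0.9 for a single consistent match, 0.5 for conflicting matches, 0.0 for no match) are illustrative assumptions, not the patterns or scoring used in `app.py`.

```python
import re

# Hypothetical field patterns; the Space's real patterns may differ.
FIELD_PATTERNS = {
    "invoice_number": r"Invoice\s*(?:Number|No\.?|#)\s*[:\-]?\s*([A-Z0-9\-]+)",
    "due_date": r"Due\s*Date\s*[:\-]?\s*(\d{4}-\d{2}-\d{2})",
    "total": r"\bTotal\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})",
}


def extract_fields(ocr_text):
    """Return {field: {"value": str or None, "confidence": float}}."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        matches = re.findall(pattern, ocr_text, flags=re.IGNORECASE)
        if not matches:
            # Nothing matched: the field is missing.
            fields[name] = {"value": None, "confidence": 0.0}
        elif len(set(matches)) == 1:
            # One consistent value (assumed heuristic score).
            fields[name] = {"value": matches[0], "confidence": 0.9}
        else:
            # Conflicting candidates: keep the first, lower the score.
            fields[name] = {"value": matches[0], "confidence": 0.5}
    return fields


sample = """ACME Corp
Invoice No: INV-1042
Due Date: 2024-07-15
Subtotal: $180.00
Total: $198.00
"""
result = extract_fields(sample)
```

The `\b` before `Total` keeps the pattern from matching inside `Subtotal`, which is the kind of detail that stays visible when patterns are written out explicitly.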

	## Technique

	This is schema extraction after OCR. Raw text is mapped into named fields, and each field gets a confidence signal.

	The method is intentionally transparent: field patterns are visible and the review threshold controls which fields require human attention.
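The role of the review threshold can be sketched as a simple filter. The function name `build_review_queue`, the field dictionary shape, and the default threshold of 0.8 are assumptions made for illustration.

```python
def build_review_queue(fields, threshold=0.8):
    """Return names of fields below the threshold, least confident first."""
    flagged = [
        (name, info["confidence"])
        for name, info in fields.items()
        if info["confidence"] < threshold
    ]
    flagged.sort(key=lambda pair: pair[1])
    return [name for name, _ in flagged]


fields = {
    "invoice_number": {"value": "INV-1042", "confidence": 0.9},
    "due_date": {"value": None, "confidence": 0.0},
    "total": {"value": "198.00", "confidence": 0.5},
}
queue = build_review_queue(fields, threshold=0.8)
```

Raising the threshold sends more fields to a human; lowering it trusts the patterns more.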

	## Output

	The app returns a field table, line-item table, confidence chart, review queue, and JSON payload.
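As a rough illustration of the line-item table and JSON payload, the sketch below assumes each line item appears on one OCR line as `<description> <quantity> <unit price> <amount>`. That layout, the key names, and the arithmetic validation check are assumptions; the payload schema produced by the Space may differ.

```python
import json
import re

# Assumed layout: "<description> <quantity> <unit price> <amount>".
LINE_ITEM = re.compile(
    r"^(?P<description>.+?)\s+(?P<quantity>\d+)\s+"
    r"(?P<unit_price>\d+\.\d{2})\s+(?P<amount>\d+\.\d{2})$"
)


def parse_line_items(ocr_text):
    """Parse line items and flag rows where quantity * unit price != amount."""
    items = []
    for line in ocr_text.splitlines():
        match = LINE_ITEM.match(line.strip())
        if not match:
            continue
        quantity = int(match["quantity"])
        unit_price = float(match["unit_price"])
        amount = float(match["amount"])
        items.append({
            "description": match["description"],
            "quantity": quantity,
            "unit_price": unit_price,
            "amount": amount,
            "valid": abs(quantity * unit_price - amount) < 0.005,
        })
    return items


text = "Widget A 2 45.00 90.00\nWidget B 3 30.00 95.00\nTotal: 185.00"
items = parse_line_items(text)
payload = json.dumps({"line_items": items}, indent=2)
```

Here the second row fails validation because 3 × 30.00 does not equal 95.00, so it would be a natural candidate for the review queue.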

	## Why It Matters

	Document AI becomes useful when extraction is inspectable. A human reviewer should know which fields were found, which were uncertain, and what JSON would be sent downstream.

	## What To Notice

	Field-level confidence is more actionable than a single document score. A document can be mostly correct while one critical field, such as total or due date, is wrong.
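A small worked example, with made-up confidence values, shows how a document-level average can hide a weak critical field:

```python
# Illustrative confidences only; not output from the Space.
confidences = {
    "vendor": 0.95,
    "invoice_number": 0.97,
    "invoice_date": 0.96,
    "due_date": 0.94,
    "total": 0.40,
}

threshold = 0.8
document_score = sum(confidences.values()) / len(confidences)
weakest_field = min(confidences, key=confidences.get)

document_passes = document_score >= threshold
fields_needing_review = [
    name for name, score in confidences.items() if score < threshold
]
```

The mean is roughly 0.84, so a document-level check at 0.8 would pass, while the field-level check still flags `total`.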

	## Effect In Practice

	This pattern supports invoice processing, procurement workflows, form extraction, and human review queues.

	## Hugging Face Extension

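	The Space could be extended to start from document images: TrOCR for text recognition, LayoutLM for layout-aware field extraction over OCR tokens, or Donut or a vision-language model for OCR-free parsing, with field-level extraction accuracy evaluated against labeled data.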

	## Limitations

	This version starts from OCR text. A full system should add image-to-text OCR or document VLM inference, table recognition, multilingual support, and labeled evaluation.
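One possible shape for the labeled evaluation is per-field exact-match accuracy. The function name `field_accuracy` and the sample documents below are hypothetical, and exact match is only one choice of metric.

```python
from collections import defaultdict


def field_accuracy(predictions, labels):
    """Exact-match accuracy per field over paired documents."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for predicted, gold in zip(predictions, labels):
        for name, gold_value in gold.items():
            total[name] += 1
            if predicted.get(name) == gold_value:
                correct[name] += 1
    return {name: correct[name] / total[name] for name in total}


# Hypothetical labeled set of two invoices.
labels = [
    {"total": "198.00", "due_date": "2024-07-15"},
    {"total": "75.50", "due_date": "2024-08-01"},
]
predictions = [
    {"total": "198.00", "due_date": "2024-07-15"},
    {"total": "75.50", "due_date": None},
]
accuracy = field_accuracy(predictions, labels)
```

Reporting accuracy per field keeps the evaluation aligned with the field-level confidence view used elsewhere in the app.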

	## Run Locally

	```bash
	pip install -r requirements.txt
	streamlit run app.py
	```