	---
	license: apache-2.0
	language:
	- en
	- multilingual
	tags:
	- document-understanding
	- token-classification
	- layout
	- lilt
	- receipts
	- invoices
	datasets:
	- bluecopa/dextr-training-data-v3
	---

	# DEXTR-LiLT: Document Extraction with Query-Conditioned Token Classification

	A LiLT (Language-independent Layout Transformer) model fine-tuned to extract fields from receipts and invoices.
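
LiLT-style models take a bounding box for each token alongside the text, conventionally scaled to a 0-1000 integer grid. The helper below is a minimal sketch of that preprocessing step, assuming this checkpoint follows the standard LiLT input convention; `normalize_box` and its parameter names are hypothetical and not part of this repository.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in pixel coordinates, e.g. as produced by an OCR engine.
Box = Tuple[float, float, float, float]


def normalize_box(box: Box, page_width: float, page_height: float) -> List[int]:
    """Scale a pixel-space box to the 0-1000 grid (assumed LiLT convention)."""

    def scale(value: float, size: float) -> int:
        # Clamp so boxes that spill slightly off the page stay in range.
        return max(0, min(1000, int(1000 * value / size)))

    x0, y0, x1, y1 = box
    return [
        scale(x0, page_width),
        scale(y0, page_height),
        scale(x1, page_width),
        scale(y1, page_height),
    ]


# A word box on a 1000 x 500 pixel page image.
print(normalize_box((100, 50, 300, 80), page_width=1000, page_height=500))
```

Clamping is a design choice made here for robustness against noisy OCR output; dropping out-of-page boxes instead would be an equally reasonable alternative.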

	## Performance (Holdout Set)

	| Metric | Score |
	|--------|-------|
	| Macro F1 | 72.2% |
	| Token Accuracy | 77.0% |
	| Table F1 | 89.2% |
	| Row Boundary F1 | 97.2% |
	| Header F1 | 94.9% |
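
Macro F1 and token accuracy can diverge because macro F1 weights every label equally, while accuracy is dominated by frequent labels. The sketch below illustrates the difference at token level. It is not the evaluation script behind the table: the card does not state whether its F1 scores are token-level or entity-level, and the label names (`O`, `TOTAL`, `DATE`) are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def token_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of tokens whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def per_label_f1(gold: List[str], pred: List[str]) -> Dict[str, float]:
    """Token-level F1 for every label seen in either gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it did not belong
            fn[g] += 1  # missed an occurrence of gold label g
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold: List[str], pred: List[str]) -> float:
    """Unweighted mean of the per-label F1 scores."""
    scores = per_label_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels: one rare field (TOTAL) is missed entirely.
gold = ["O", "O", "O", "O", "TOTAL", "DATE"]
pred = ["O", "O", "O", "O", "O", "DATE"]
print(f"accuracy = {token_accuracy(gold, pred):.3f}")
print(f"macro F1 = {macro_f1(gold, pred):.3f}")
```

In this toy case a single missed rare label costs one token of accuracy but zeroes out a third of the macro average, which is one way a model can score lower on macro F1 than on token accuracy.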

	## Training

	- Epochs: 20
	- Batch Size: 24
	- Learning Rate: 2e-5
	- Training Data: ~3000 documents
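
As a rough sanity check, the hyperparameters above imply an approximate number of optimizer steps. The sketch below assumes one training example per document, exactly 3000 documents, no gradient accumulation, and a single device; none of these is stated in the card. `TrainingConfig` is a hypothetical container, not a class from the training code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    epochs: int = 20
    batch_size: int = 24
    learning_rate: float = 2e-5
    num_documents: int = 3000  # the card says "~3000"; exact count is assumed

    @property
    def steps_per_epoch(self) -> int:
        # A final partial batch still counts as one step.
        return math.ceil(self.num_documents / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.epochs * self.steps_per_epoch


cfg = TrainingConfig()
print(cfg.steps_per_epoch, cfg.total_steps)
```

If documents were split into multiple windows or pages during training, the real step count would be higher than this estimate.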

	## License

	Apache 2.0