---
license: apache-2.0
language:
- en
- multilingual
tags:
- document-understanding
- token-classification
- layout
- lilt
- receipts
- invoices
datasets:
- bluecopa/dextr-training-data-v3
---
# DEXTR-LiLT: Document Extraction with Query-Conditioned Token Classification
A fine-tuned LiLT model that extracts document fields from receipts and invoices using query-conditioned token classification.
## Performance (Holdout Set)
| Metric | Score |
|--------|-------|
| Macro F1 | 72.2% |
| Token Accuracy | 77.0% |
| Table F1 | 89.2% |
| Row Boundary F1 | 97.2% |
| Header F1 | 94.9% |
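
The gap between Macro F1 and Token Accuracy in the table is easier to read with a small example. This card does not state how the metrics were computed, so the sketch below assumes the common definitions: token accuracy is the fraction of tokens labeled correctly, and macro F1 is the unweighted mean of per-label F1. The label names (`O`, `TOTAL`, `DATE`) are hypothetical and not taken from the model's label set.

```python
from collections import defaultdict


def per_label_f1(gold, pred):
    """Return {label: F1} for two equal-length sequences of token labels."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed the gold label g
    scores = {}
    for label in set(gold) | set(pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 (assumed definition, see lead-in)."""
    scores = per_label_f1(gold, pred)
    return sum(scores.values()) / len(scores)


def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Hypothetical labels for a five-token receipt line.
gold = ["O", "O", "O", "TOTAL", "DATE"]
pred = ["O", "O", "O", "TOTAL", "O"]

print(f"token accuracy: {token_accuracy(gold, pred):.3f}")
print(f"macro F1:       {macro_f1(gold, pred):.3f}")
```

Under these definitions a single missed rare label lowers macro F1 far more than token accuracy, because every label counts equally regardless of how many tokens carry it.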
## Training
- Epochs: 20
- Batch Size: 24
- Learning Rate: 2e-5
- Training Data: ~3000 documents
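
For a rough sense of training length, the hyperparameters above can be turned into an optimizer step count. This is a back-of-the-envelope sketch, not the actual training script: it assumes one training example per document, exactly 3000 documents (the card says approximately), a single device, and no gradient accumulation. The function name `optimizer_steps` is made up for illustration.

```python
import math

# Values copied from the list above; NUM_DOCUMENTS is approximate.
EPOCHS = 20
BATCH_SIZE = 24
LEARNING_RATE = 2e-5
NUM_DOCUMENTS = 3000


def optimizer_steps(num_examples, batch_size, epochs):
    """Return (steps_per_epoch, total_steps), keeping the last partial batch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


per_epoch, total = optimizer_steps(NUM_DOCUMENTS, BATCH_SIZE, EPOCHS)
print(per_epoch, total)  # 125 2500
```

If documents were split into several windows or queries per document during training, the real step count would be higher.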
## License
Apache 2.0