---
license: apache-2.0
language:
- en
- multilingual
tags:
- document-understanding
- token-classification
- layout
- lilt
- receipts
- invoices
datasets:
- bluecopa/dextr-training-data-v3
---
# DEXTR-LiLT: Document Extraction with Query-Conditioned Token Classification
A fine-tuned LiLT (Language-Independent Layout Transformer) model that extracts document fields from receipts and invoices using query-conditioned token classification.
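Layout-aware models in the LiLT family take a bounding box per token alongside the text, and the Hugging Face Transformers implementation expects those boxes scaled to a 0-1000 grid. The sketch below shows that preprocessing step only. The words, pixel boxes, and page size are hypothetical, and this card does not document how the query is encoded, so that part is left out.

```python
def normalize_bbox(bbox, page_width, page_height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 integer grid."""
    x0, y0, x1, y1 = bbox
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


# Hypothetical OCR output for one receipt line (not taken from the dataset).
words = ["TOTAL", "42.50"]
pixel_boxes = [(100, 900, 220, 930), (600, 900, 700, 930)]

# Assumed page size in pixels; use the real image dimensions in practice.
normalized = [
    normalize_bbox(box, page_width=800, page_height=1000) for box in pixel_boxes
]

for word, box in zip(words, normalized):
    print(word, box)
```

Normalizing per page keeps coordinates comparable across scans of different resolutions.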
## Performance (Holdout Set)
| Metric | Score |
|--------|-------|
| Macro F1 | 72.2% |
| Token Accuracy | 77.0% |
| Table F1 | 89.2% |
| Row Boundary F1 | 97.2% |
| Header F1 | 94.9% |
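Macro F1 can sit below token accuracy because it averages per-label F1 scores with equal weight, so rare field labels count as much as frequent ones. The card does not state how its metrics were computed (for example, whether a background label is included), so the following is only a minimal sketch of the standard definitions, with hypothetical label names.

```python
from collections import Counter


def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it did not belong
            fn[g] += 1  # missed an occurrence of gold label g
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for six tokens; "O" marks tokens outside any field.
gold = ["O", "O", "O", "TOTAL", "DATE", "DATE"]
pred = ["O", "O", "O", "TOTAL", "O", "DATE"]

print(f"token accuracy: {token_accuracy(gold, pred):.3f}")
print(f"macro F1:       {macro_f1(gold, pred):.3f}")
```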
## Training
- Epochs: 20
- Batch Size: 24
- Learning Rate: 2e-5
- Training Data: ~3000 documents
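As a rough sanity check, the hyperparameters above imply the number of optimizer steps in the run. The sketch assumes exactly 3000 training examples (the card says "~3000"), one example per document, a single device, and no gradient accumulation; none of these is confirmed by the card, and `TrainingConfig` is a hypothetical name.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    epochs: int = 20
    batch_size: int = 24
    learning_rate: float = 2e-5
    num_documents: int = 3000  # assumption: the card only says "~3000"

    @property
    def steps_per_epoch(self) -> int:
        # The last batch may be partial, so round up.
        return math.ceil(self.num_documents / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.epochs


config = TrainingConfig()
print(config.steps_per_epoch, config.total_steps)
```

Under those assumptions the run is about 125 steps per epoch and 2,500 steps in total.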
## License
Apache 2.0