Create README.md
---
language: en
license: apache-2.0
tags:
- receipt
- document-ai
- layoutlmv3
- token-classification
- ner
- information-extraction
datasets:
- custom-receipt-dataset
metrics:
- accuracy
widget:
- text: "STORE_NAME Date: 2024-01-01 Total: 25.99"
---

# LayoutLMv3 Receipt Parser

A LayoutLMv3 model fine-tuned to extract structured information from receipt images. It reaches **89.34% validation accuracy**.

## Model Details

- **Model**: `albertosei/layoutlmv3-receipt-parser`
- **Architecture**: LayoutLMv3-base
- **Task**: Token Classification (Named Entity Recognition)
- **Languages**: English
- **Training Data**: 1,426 receipt samples
- **Validation**: 100 samples
- **License**: Apache 2.0

## Performance Metrics

| Metric | Value |
|--------|-------|
| **Final Validation Accuracy** | **89.34%** |
| Training Loss (Epoch 1) | 0.6824 |
| Training Loss (Epoch 2) | 0.3278 |
| Validation Accuracy (Epoch 1) | 83.49% |
| Number of Entity Labels | 51 |

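This card does not say how the accuracy figures were computed. As a minimal sketch of one common convention for token classification, the snippet below counts matching label ids while skipping positions marked with an ignore index. The function name `token_accuracy` and the `-100` ignore index are assumptions for illustration, not details taken from the training code.

```python
def token_accuracy(predictions, labels, ignore_index=-100):
    """Fraction of non-ignored positions where the predicted id equals the gold id."""
    correct = 0
    total = 0
    for pred_seq, label_seq in zip(predictions, labels):
        for pred, label in zip(pred_seq, label_seq):
            if label == ignore_index:
                # Special tokens and padding are usually excluded from scoring
                continue
            total += 1
            correct += int(pred == label)
    return correct / total if total else 0.0


# Toy example with made-up label ids (not real model output)
preds = [[0, 3, 3, 1], [2, 2, 0]]
golds = [[-100, 3, 4, 1], [2, 0, -100]]
accuracy = token_accuracy(preds, golds)
print(f"{accuracy:.2f}")
```
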
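## Entity Labels

The model recognizes 25 entity types, tagged in BIO format. This is consistent with the 51 labels listed under Performance Metrics if each type has a `B-` and an `I-` tag and there is a single `O` tag:
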
### Vendor Information
- `vendor_name` - Store/business name
- `vendor_address` - Physical address
- `vendor_phone_number` - Contact number

### Date & Time
- `date` - Transaction date
- `time` - Transaction time

### Receipt Details
- `receipt_id` - Receipt number/identifier
- `currency` - Currency type

### Financial Amounts
- `total_amount` - Final total
- `subtotal_amount` - Subtotal before tax
- `tax_amount` - Tax amount
- `service_charge_amount` - Service fees
- `discount_amount` - Discounts applied
- `tip_amount` - Tip/gratuity

### Payment Information
- `cash_paid_amount` - Cash payment
- `change_amount` - Change returned
- `credit_card_amount` - Credit card payment
- `e_money_amount` - Electronic payment
- `payment_method` - Payment type

### Line Items
- `line_item_name` - Product/service name
- `line_item_quantity` - Quantity purchased
- `line_item_unit_price` - Price per unit
- `line_item_total_price` - Line item total
- `line_item_discount_amount` - Item-level discount
- `line_item_vat_status` - VAT information

### Other
- `other` - Miscellaneous information

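To make the relationship between the entity types and the label count concrete, here is a small sketch that expands the 25 types listed above into a BIO label set. The exact label strings (`B-vendor_name`, `I-vendor_name`, `O`) and their order are assumptions; the authoritative mapping is whatever `model.config.id2label` contains.

```python
# Entity types copied from the lists above
ENTITY_TYPES = [
    "vendor_name", "vendor_address", "vendor_phone_number",
    "date", "time",
    "receipt_id", "currency",
    "total_amount", "subtotal_amount", "tax_amount",
    "service_charge_amount", "discount_amount", "tip_amount",
    "cash_paid_amount", "change_amount", "credit_card_amount",
    "e_money_amount", "payment_method",
    "line_item_name", "line_item_quantity", "line_item_unit_price",
    "line_item_total_price", "line_item_discount_amount",
    "line_item_vat_status",
    "other",
]


def build_bio_labels(entity_types):
    """Return ["O", "B-x", "I-x", ...] for the given entity types (assumed naming)."""
    labels = ["O"]
    for name in entity_types:
        labels.append(f"B-{name}")
        labels.append(f"I-{name}")
    return labels


LABELS = build_bio_labels(ENTITY_TYPES)
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}

print(len(ENTITY_TYPES), len(LABELS))  # 25 51
```
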
## Usage

```python
from transformers import AutoProcessor, AutoModelForTokenClassification
from PIL import Image
import torch

# Load model and processor (apply_ocr=False: words and boxes come from your own OCR)
processor = AutoProcessor.from_pretrained("albertosei/layoutlmv3-receipt-parser", apply_ocr=False)
model = AutoModelForTokenClassification.from_pretrained("albertosei/layoutlmv3-receipt-parser")

# Prepare inputs (requires external OCR for text and bounding boxes)
image = Image.open("receipt.jpg").convert("RGB")
words = ["STORE", "NAME", "Date:", "2024-01-01", "Total:", "25.99"]  # From OCR
# Boxes are [x0, y0, x1, y1], normalized to the 0-1000 range LayoutLMv3 expects
boxes = [[0, 0, 100, 20], [100, 0, 200, 20], [0, 20, 50, 40],
         [50, 20, 150, 40], [0, 40, 50, 60], [50, 40, 150, 60]]  # From OCR

# Process and predict
encoding = processor(image, words, boxes=boxes, return_tensors="pt")
with torch.no_grad():
    outputs = model(**encoding)
predictions = outputs.logits.argmax(-1).squeeze(0).tolist()

# Convert to labels (one per token, including special tokens and subword pieces)
predicted_labels = [model.config.id2label[pred] for pred in predictions]
```
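
The `boxes` in the example above are already on the 0-1000 scale that LayoutLMv3 expects. OCR engines typically return pixel coordinates, so a conversion step is usually needed. The helper below is a minimal sketch of that conversion; the name `normalize_box` and the sample page size are illustrative assumptions.

```python
def normalize_box(box, width, height):
    """Scale a pixel-space [x0, y0, x1, y1] box to the 0-1000 range."""
    x0, y0, x1, y1 = box

    def scale(value, size):
        # Clamp so slightly out-of-bounds OCR boxes stay within 0-1000
        return max(0, min(1000, int(1000 * value / size)))

    return [scale(x0, width), scale(y0, height), scale(x1, width), scale(y1, height)]


# Hypothetical OCR output in pixels for an 800x1600 image
pixel_boxes = [[0, 0, 200, 40], [200, 0, 400, 40]]
width, height = 800, 1600
normalized_boxes = [normalize_box(b, width, height) for b in pixel_boxes]
print(normalized_boxes)  # [[0, 0, 250, 25], [250, 0, 500, 25]]
```

With a PIL image, `width, height = image.size` supplies the two size arguments.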
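
The predicted labels are per token, so they include special tokens and subword pieces rather than one label per OCR word. A possible post-processing step, sketched below, keeps the label of each word's first token and then merges consecutive `B-`/`I-` tags into entities. Both helper names are hypothetical, the label strings assume the `B-<type>` / `I-<type>` / `O` naming, and the sample `word_ids` and labels are invented for illustration. With a fast tokenizer, `encoding.word_ids(batch_index=0)` provides the word index for each token.

```python
def first_token_labels(word_ids, token_labels):
    """Keep one label per word: the label of the word's first token."""
    word_labels = []
    previous = None
    for word_id, label in zip(word_ids, token_labels):
        # word_id is None for special tokens; repeated ids are subword pieces
        if word_id is not None and word_id != previous:
            word_labels.append(label)
        previous = word_id
    return word_labels


def group_entities(words, word_labels):
    """Merge consecutive B-/I- tagged words into (entity_type, text) pairs."""
    entities = []
    current_type, current_words = None, []
    for word, label in zip(words, word_labels):
        prefix, _, entity_type = label.partition("-")
        if prefix == "I" and entity_type == current_type:
            current_words.append(word)  # continue the open entity
            continue
        if current_type is not None:
            entities.append((current_type, " ".join(current_words)))
        if prefix in ("B", "I"):
            # A stray I- tag without a matching B- starts a new entity
            current_type, current_words = entity_type, [word]
        else:
            current_type, current_words = None, []
    if current_type is not None:
        entities.append((current_type, " ".join(current_words)))
    return entities


# Invented example: 6 OCR words, 10 tokens including two special tokens
words = ["STORE", "NAME", "Date:", "2024-01-01", "Total:", "25.99"]
word_ids = [None, 0, 0, 1, 2, 3, 3, 4, 5, None]
token_labels = ["O", "B-vendor_name", "I-vendor_name", "I-vendor_name", "O",
                "B-date", "I-date", "O", "B-total_amount", "O"]

word_labels = first_token_labels(word_ids, token_labels)
entities = group_entities(words, word_labels)
print(entities)
# [('vendor_name', 'STORE NAME'), ('date', '2024-01-01'), ('total_amount', '25.99')]
```
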