---
language: en
license: apache-2.0
tags:
- receipt
- document-ai
- layoutlmv3
- token-classification
- ner
- information-extraction
datasets:
- custom-receipt-dataset
metrics:
- accuracy
widget:
- text: "STORE_NAME Date: 2024-01-01 Total: 25.99"
---

# LayoutLMv3 Receipt Parser

A fine-tuned LayoutLMv3 model for extracting structured information from receipt images. It reaches **89.34% validation accuracy**.

## Model Details

- **Model**: `albertosei/layoutlmv3-receipt-parser`
- **Architecture**: LayoutLMv3-base
- **Task**: Token Classification (Named Entity Recognition)
- **Languages**: English
- **Training Data**: 1,426 receipt samples
- **Validation**: 100 samples
- **License**: Apache 2.0

## Performance Metrics

| Metric | Value |
|--------|-------|
| **Final Validation Accuracy** | **89.34%** |
| Training Loss (Epoch 1) | 0.6824 |
| Training Loss (Epoch 2) | 0.3278 |
| Validation Accuracy (Epoch 1) | 83.49% |
| Number of Entity Labels | 51 |

## Entity Labels

The model recognizes 25 entity types, tagged in BIO format (25 × 2 `B-`/`I-` tags plus `O` would account for the 51 labels reported above):

### Vendor Information
- `vendor_name` - Store/business name
- `vendor_address` - Physical address
- `vendor_phone_number` - Contact number

### Date & Time
- `date` - Transaction date
- `time` - Transaction time

### Receipt Details
- `receipt_id` - Receipt number/identifier
- `currency` - Currency type

### Financial Amounts
- `total_amount` - Final total
- `subtotal_amount` - Subtotal before tax
- `tax_amount` - Tax amount
- `service_charge_amount` - Service fees
- `discount_amount` - Discounts applied
- `tip_amount` - Tip/gratuity

### Payment Information
- `cash_paid_amount` - Cash payment
- `change_amount` - Change returned
- `credit_card_amount` - Credit card payment
- `e_money_amount` - Electronic payment
- `payment_method` - Payment type

### Line Items
- `line_item_name` - Product/service name
- `line_item_quantity` - Quantity purchased
- `line_item_unit_price` - Price per unit
- `line_item_total_price` - Line item total
- `line_item_discount_amount` - Item-level discount
- `line_item_vat_status` - VAT information

### Other
- `other` - Miscellaneous information
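
As a rough illustration of how 25 entity types expand into 51 BIO labels, the sketch below builds a label list from the names in this section. The `B-`/`I-` prefix spelling, the label order, and the helper name `build_bio_labels` are assumptions made for the example; the authoritative mapping is whatever `model.config.id2label` contains in the checkpoint.

```python
# Entity types as listed in this model card.
ENTITY_TYPES = [
    "vendor_name", "vendor_address", "vendor_phone_number",
    "date", "time",
    "receipt_id", "currency",
    "total_amount", "subtotal_amount", "tax_amount",
    "service_charge_amount", "discount_amount", "tip_amount",
    "cash_paid_amount", "change_amount", "credit_card_amount",
    "e_money_amount", "payment_method",
    "line_item_name", "line_item_quantity", "line_item_unit_price",
    "line_item_total_price", "line_item_discount_amount",
    "line_item_vat_status",
    "other",
]


def build_bio_labels(entity_types):
    """Return ["O", "B-x", "I-x", ...] for the given entity types.

    Hypothetical helper: the real checkpoint may order or spell
    its labels differently.
    """
    labels = ["O"]
    for entity in entity_types:
        labels.append(f"B-{entity}")  # first token of an entity span
        labels.append(f"I-{entity}")  # continuation token of the span
    return labels


labels = build_bio_labels(ENTITY_TYPES)
id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}

print(len(ENTITY_TYPES), len(labels))  # 25 51
```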

## Usage
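
LayoutLMv3 expects each bounding box as `[x0, y0, x1, y1]` scaled to a 0-1000 range relative to the page, while OCR engines usually report pixel coordinates. A minimal sketch of that conversion, assuming pixel-space boxes and a known image size (the helper name `normalize_box` and the sample values are illustrative, not part of this model's API):

```python
def normalize_box(box, width, height):
    """Scale a pixel-space [x0, y0, x1, y1] box to the 0-1000 range."""
    x0, y0, x1, y1 = box
    scaled = [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]
    # Clamp in case the OCR engine reports coordinates slightly off-page.
    return [max(0, min(1000, value)) for value in scaled]


# Illustrative values: two OCR boxes on an 800x1200 pixel image.
width, height = 800, 1200
pixel_boxes = [[40, 30, 360, 90], [400, 1100, 760, 1180]]

boxes = [normalize_box(box, width, height) for box in pixel_boxes]
print(boxes)  # [[50, 25, 450, 75], [500, 916, 950, 983]]
```

The resulting `boxes` list is what the processor takes as its `boxes` argument when it is loaded with `apply_ocr=False`.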

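```python
from transformers import AutoProcessor, AutoModelForTokenClassification
from PIL import Image
import torch

# Load model and processor (apply_ocr=False: words and boxes are supplied by the caller)
processor = AutoProcessor.from_pretrained("albertosei/layoutlmv3-receipt-parser", apply_ocr=False)
model = AutoModelForTokenClassification.from_pretrained("albertosei/layoutlmv3-receipt-parser")
model.eval()

# Prepare inputs (requires external OCR for text and bounding boxes)
image = Image.open("receipt.jpg").convert("RGB")
words = ["STORE", "NAME", "Date:", "2024-01-01", "Total:", "25.99"]  # From OCR
# From OCR, as [x0, y0, x1, y1] normalized to a 0-1000 scale
boxes = [[0, 0, 100, 20], [100, 0, 200, 20], [0, 20, 50, 40],
         [50, 20, 150, 40], [0, 40, 50, 60], [50, 40, 150, 60]]

# Process and predict
encoding = processor(image, words, boxes=boxes, return_tensors="pt")
with torch.no_grad():
    outputs = model(**encoding)
predictions = outputs.logits.argmax(-1).squeeze(0).tolist()

# Convert to labels, keeping one prediction per input word. The tokenizer adds
# special tokens and may split a word into several sub-tokens, so only the
# first sub-token of each word is used (word_ids() needs the fast tokenizer).
predicted_labels = []
previous_word_id = None
for token_pred, word_id in zip(predictions, encoding.word_ids(batch_index=0)):
    if word_id is not None and word_id != previous_word_id:
        predicted_labels.append(model.config.id2label[token_pred])
    previous_word_id = word_id

for word, label in zip(words, predicted_labels):
    print(word, label)
```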