	---
	language: en
	license: apache-2.0
	tags:
	- receipt
	- document-ai
	- layoutlmv3
	- token-classification
	- ner
	- information-extraction
	datasets:
	- custom-receipt-dataset
	metrics:
	- accuracy
	widget:
	- text: "STORE_NAME Date: 2024-01-01 Total: 25.99"
	---

	# LayoutLMv3 Receipt Parser

	A fine-tuned LayoutLMv3 model that extracts structured information from receipt images. It reaches 89.34% accuracy on the validation set.

	## Model Details

	- Model: `albertosei/layoutlmv3-receipt-parser`
	- Architecture: LayoutLMv3-base
	- Task: Token Classification (Named Entity Recognition)
	- Languages: English
	- Training Data: 1,426 receipt samples
	- Validation: 100 samples
	- License: Apache 2.0

	## Performance Metrics

	| Metric | Value |
	|--------|-------|
	| Final Validation Accuracy | 89.34% |
	| Training Loss (Epoch 1) | 0.6824 |
	| Training Loss (Epoch 2) | 0.3278 |
	| Validation Accuracy (Epoch 1) | 83.49% |
	| Number of Entity Labels | 51 |
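The card does not say how validation accuracy was computed. A common convention for token classification is token-level accuracy that skips positions labeled `-100` (special tokens and padding). The sketch below illustrates that convention only; the function name `token_accuracy`, the `ignore_index` value, and the sample IDs are assumptions, not details taken from this model's training code.

```python
def token_accuracy(predictions, references, ignore_index=-100):
    """Fraction of scored positions where the predicted ID equals the reference ID.

    Positions whose reference equals ignore_index are skipped (assumed convention).
    """
    correct = 0
    total = 0
    for pred, ref in zip(predictions, references):
        if ref == ignore_index:
            continue  # not scored: special token or padding
        total += 1
        correct += int(pred == ref)
    return correct / total if total else 0.0


# Hypothetical label IDs for one short sequence
preds = [0, 3, 3, 7, 0]
refs = [-100, 3, 4, 7, -100]
print(token_accuracy(preds, refs))  # 2 of the 3 scored positions match
```

If the reported figure was computed differently (for example at the entity level), the numbers are not directly comparable with this sketch.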

	## Entity Labels

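	The model recognizes 25 entity types, tagged in BIO format. With a `B-` and an `I-` tag per type plus a single `O` tag, this is consistent with the 51 labels reported above.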

	### Vendor Information
	- `vendor_name` - Store/business name
	- `vendor_address` - Physical address
	- `vendor_phone_number` - Contact number

	### Date & Time
	- `date` - Transaction date
	- `time` - Transaction time

	### Receipt Details
	- `receipt_id` - Receipt number/identifier
	- `currency` - Currency type

	### Financial Amounts
	- `total_amount` - Final total
	- `subtotal_amount` - Subtotal before tax
	- `tax_amount` - Tax amount
	- `service_charge_amount` - Service fees
	- `discount_amount` - Discounts applied
	- `tip_amount` - Tip/gratuity

	### Payment Information
	- `cash_paid_amount` - Cash payment
	- `change_amount` - Change returned
	- `credit_card_amount` - Credit card payment
	- `e_money_amount` - Electronic payment
	- `payment_method` - Payment type

	### Line Items
	- `line_item_name` - Product/service name
	- `line_item_quantity` - Quantity purchased
	- `line_item_unit_price` - Price per unit
	- `line_item_total_price` - Line item total
	- `line_item_discount_amount` - Item-level discount
	- `line_item_vat_status` - VAT information

	### Other
	- `other` - Miscellaneous information
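As a rough illustration of how 25 entity types expand into a BIO label set, the sketch below builds one. The `B-`/`I-` prefix spelling, the label order, and the helper name `build_bio_labels` are assumptions; the authoritative mapping is whatever `model.config.id2label` contains.

```python
# Entity types as listed in this model card (25 in total).
ENTITY_TYPES = [
    "vendor_name", "vendor_address", "vendor_phone_number",
    "date", "time",
    "receipt_id", "currency",
    "total_amount", "subtotal_amount", "tax_amount",
    "service_charge_amount", "discount_amount", "tip_amount",
    "cash_paid_amount", "change_amount", "credit_card_amount",
    "e_money_amount", "payment_method",
    "line_item_name", "line_item_quantity", "line_item_unit_price",
    "line_item_total_price", "line_item_discount_amount",
    "line_item_vat_status",
    "other",
]


def build_bio_labels(entity_types):
    """Return ["O", "B-x", "I-x", ...] for the given entity types.

    The prefix format and ordering are assumed, not read from the model config.
    """
    labels = ["O"]
    for name in entity_types:
        labels.append(f"B-{name}")  # first word of an entity
        labels.append(f"I-{name}")  # continuation of the same entity
    return labels


labels = build_bio_labels(ENTITY_TYPES)
print(len(ENTITY_TYPES), len(labels))  # 25 51
```

To see the labels the model actually uses, inspect `model.config.id2label` after loading it.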

	## Usage

	```python
	from transformers import AutoProcessor, AutoModelForTokenClassification
	from PIL import Image
	import torch

	# Load model and processor
	processor = AutoProcessor.from_pretrained("albertosei/layoutlmv3-receipt-parser", apply_ocr=False)
	model = AutoModelForTokenClassification.from_pretrained("albertosei/layoutlmv3-receipt-parser")

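	# Prepare inputs (requires external OCR for text and bounding boxes)
	image = Image.open("receipt.jpg").convert("RGB")
	words = ["STORE", "NAME", "Date:", "2024-01-01", "Total:", "25.99"]  # From OCR
	# One [x0, y0, x1, y1] box per word, normalized to a 0-1000 scale (from OCR)
	boxes = [[0, 0, 100, 20], [100, 0, 200, 20], [0, 20, 50, 40],
	         [50, 20, 150, 40], [0, 40, 50, 60], [50, 40, 150, 60]]

	# Process and predict
	encoding = processor(image, words, boxes=boxes, return_tensors="pt")
	with torch.no_grad():
	    outputs = model(**encoding)
	predictions = outputs.logits.argmax(-1)[0].tolist()  # one label ID per token

	# Convert to labels: keep the first sub-token of each word and skip
	# special tokens (word_ids() requires a fast tokenizer)
	predicted_labels, seen = [], set()
	for word_id, pred in zip(encoding.word_ids(0), predictions):
	    if word_id is not None and word_id not in seen:
	        seen.add(word_id)
	        predicted_labels.append(model.config.id2label[pred])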
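
# Optional post-processing sketch (not part of the original card).
# Groups word-level BIO labels into (entity_type, text) spans.
# Assumptions: labels look like "B-total_amount" / "I-total_amount" / "O"
# (check model.config.id2label for the actual strings); the helper name
# bio_to_entities and the sample data below are hypothetical.
def bio_to_entities(words, labels):
    entities = []
    current_type, current_words = None, []
    for word, label in zip(words, labels):
        prefix, _, entity_type = label.partition("-")
        if prefix == "I" and entity_type == current_type:
            current_words.append(word)  # continue the open span
            continue
        if current_type is not None:
            entities.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
        if prefix in ("B", "I"):  # a stray I- tag also starts a new span
            current_type, current_words = entity_type, [word]
    if current_type is not None:
        entities.append((current_type, " ".join(current_words)))
    return entities

sample_words = ["STORE", "NAME", "Date:", "2024-01-01", "Total:", "25.99"]
sample_labels = ["B-vendor_name", "I-vendor_name", "O", "B-date", "O", "B-total_amount"]
sample_entities = bio_to_entities(sample_words, sample_labels)
print(sample_entities)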