---
language:
- spa
- en
pipeline-tag: image-classification
---
**InvoiceReceiptClassifier** is a fine-tuned LayoutLMv2 model that classifies a document image as either an invoice or a receipt.

## Quick start: using the raw model

```python
from transformers import (
    AutoModelForSequenceClassification,
    LayoutLMv2FeatureExtractor,
    LayoutLMv2Tokenizer,
    LayoutLMv2Processor,
)

# LayoutLMv2 models depend on detectron2 for the visual backbone.
model = AutoModelForSequenceClassification.from_pretrained("fedihch/InvoiceReceiptClassifier")
# The feature extractor runs OCR by default (apply_ocr=True), which requires pytesseract.
feature_extractor = LayoutLMv2FeatureExtractor()
tokenizer = LayoutLMv2Tokenizer.from_pretrained("microsoft/layoutlmv2-base-uncased")
# The processor combines image preprocessing/OCR with tokenization.
processor = LayoutLMv2Processor(feature_extractor, tokenizer)
```
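```python
import torch
from PIL import Image

# Load the document image; "*****.jpg" stands in for your own file path.
input_img = Image.open("*****.jpg")
w, h = input_img.size
# Convert to RGB and rescale to a height of 600 px, preserving the aspect ratio.
input_img = input_img.convert("RGB").resize((int(w * 600 / h), 600))
# The processor runs OCR and builds the token, bounding-box, and image tensors.
encoded_inputs = processor(input_img, return_tensors="pt")
# Move every input tensor to the model's device.
for k, v in encoded_inputs.items():
    encoded_inputs[k] = v.to(model.device)
# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    outputs = model(**encoded_inputs)
logits = outputs.logits
predicted_class_idx = logits.argmax(-1).item()
id2label = {0: "invoice", 1: "receipt"}
print(id2label[predicted_class_idx])
```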
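
The snippet above prints only the top label. If a confidence score is also useful, the two logits can be passed through a softmax. The following is a minimal sketch in plain Python, not part of the original model card: `label_with_confidence` is a hypothetical helper name, and the example logits are made up for illustration. With the real model you would pass something like `outputs.logits[0].tolist()`.

```python
import math

# Label mapping copied from the quick-start snippet.
ID2LABEL = {0: "invoice", 1: "receipt"}


def label_with_confidence(logits):
    """Hypothetical helper: map raw class logits to (label, probability)."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class.
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration only.
label, confidence = label_with_confidence([2.0, 0.0])
print(f"{label} ({confidence:.2f})")  # prints "invoice (0.88)"
```

The softmax value is only a relative score between the two classes; it says nothing about inputs that are neither invoices nor receipts.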