callmeeric5's picture
Update README.md
246f852 verified
metadata
license: apache-2.0
datasets:
  - mychen76/invoices-and-receipts_ocr_v1
language:
  - en

Intro

The model* is fine-tuned on Qwen2.5-3B-VL using a dataset of invoices and receipts. It can be used to extract text from the input and return the output in a specified JSon format.

*It is already merged with the LoRA layer and the original model. Be mindful of the input size to avoid a CUDA out-of-memory error.

Here is an example notebook of inference

For the LoRA params only, go to this repo

Usage:

from transformers import AutoModelForVision2Seq, AutoProcessor, AutoTokenizer

model = AutoModelForVision2Seq.from_pretrained(
    "callmeeric5/Qwen3B-Invoice-Receipt",
    device_map="cuda", #auto
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained("callmeeric5/Qwen-3B-Invoice-Receipt")
processor = AutoProcessor.from_pretrained("callmeeric5/Qwen-3B-Invoice-Receipt")