Model Card: sapkotapraful/FullyOCR (finetuned)

  • Developed by: sapkotapraful
  • License: apache-2.0
  • Model: sapkotapraful/FullyOCR
  • Framework: Unsloth (FastVisionModel) + PyTorch

Short description

  • FullyOCR is a vision-language OCR model finetuned for extracting text and structured content from images and PDFs. It is intended for research, prototyping, and non-critical document extraction tasks.

Intended use

  • OCR/text extraction from images and scanned documents.
  • Not for automated medical, legal, or safety-critical decisions without human review.

How to load (using Unsloth; no external API calls)

  • Minimal local loading and inference example. Adjust device/quantization flags as needed.
from unsloth import FastVisionModel
import torch
from PIL import Image

# Load the model and its processor (named `tokenizer` here, following Unsloth usage).
# load_in_4bit=True enables 4-bit quantization; set it to False for full precision.
model, tokenizer = FastVisionModel.from_pretrained(
    "sapkotapraful/FullyOCR",
    load_in_4bit=True,
)

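# Switch Unsloth to inference mode and disable dropout
FastVisionModel.for_inference(model)
model.eval()
device = "cuda" if torch.cuda.is_available() else "cpu"
# No model.to(device): a 4-bit bitsandbytes model is placed on its device at
# load time, and moving it afterwards can raise an error in some versions.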

# Instruction token used during finetuning
instruction = "<|MD|>"

# Prepare messages in training-time template
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": instruction}
    ]}
]

input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# `image` must be a PIL.Image in RGB mode; "page.png" is a placeholder path
image = Image.open("page.png").convert("RGB")
# The processor returns tensors suitable for model.generate
inputs = tokenizer(
    image,               # PIL.Image object
    input_text,
    add_special_tokens=False,
    return_tensors="pt",
).to(device)

with torch.no_grad(), torch.amp.autocast(device_type="cuda", enabled=(device=="cuda")):
    output_ids = model.generate(
        **inputs,
        max_new_tokens=1024,
        use_cache=True,
        num_beams=1,
        do_sample=False,
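        # `tokenizer` is a processor for vision models; read the pad id from
        # its inner text tokenizer when one is present
        pad_token_id=getattr(tokenizer, "tokenizer", tokenizer).pad_token_id,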
    )

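# Decode only the newly generated tokens. The prompt occupies the first
# inputs["input_ids"].shape[1] positions of each output sequence, and the
# instruction token may be dropped by skip_special_tokens, so splitting the
# decoded string on it is unreliable.
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
extracted = tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0].strip()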
print(extracted)
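
The example above handles a single image. For multi-page input, such as a scanned PDF that has already been rendered to page images, the same steps can be wrapped in a function and applied page by page. The sketch below assumes a hypothetical callable `ocr_page` that wraps the tokenizer and `model.generate` steps shown above and returns the extracted text for one page. The `ocr_document` name and the page-separator format are likewise illustrative and are not part of the model or of Unsloth.

```python
from typing import Callable, Iterable, TypeVar

Page = TypeVar("Page")  # e.g. a PIL.Image for one rendered page


def ocr_document(
    pages: Iterable[Page],
    ocr_page: Callable[[Page], str],
    separator: str = "\n\n<!-- page {n} -->\n\n",
) -> str:
    """Run a single-page OCR callable over every page and join the results.

    `ocr_page` is a hypothetical wrapper around the single-image steps
    (chat template, tokenizer call, model.generate, decode).
    """
    parts = []
    for n, page in enumerate(pages, start=1):
        text = ocr_page(page).strip()
        if n > 1:
            # Mark where each later page begins (assumed separator format).
            parts.append(separator.format(n=n))
        parts.append(text)
    return "".join(parts)


if __name__ == "__main__":
    # Stand-in for the real model call so the sketch runs without a GPU.
    def fake_ocr(page: str) -> str:
        return f"text of {page}"

    print(ocr_document(["page-a", "page-b"], fake_ocr))
```

Passing the OCR step in as a callable keeps the loop independent of the model, so it can be exercised with a stub before any weights are loaded.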