# Khasi-OCR-v1
This is a fine-tuned version of DeepSeek-OCR-2 specifically optimized for high-accuracy Optical Character Recognition (OCR) of the Khasi language.
It was trained on a custom dataset of Khasi news articles, official documents, and literature to overcome the common hallucination and repetition issues (such as infinite loops) found in base multimodal models when handling low-resource languages.
## Model Highlights
- Language Support: Native Khasi (using Latin script with special characters like ï and ñ).
- Task: Specialized for "Free OCR" (literal document transcription).
- Base Model: DeepSeek-OCR-2.
- Accuracy: Significantly lower Character Error Rate (CER) and Word Error Rate (WER) than the base model on Khasi text.
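For context on the CER/WER metrics mentioned above, a small sketch of how they are typically computed (standard Levenshtein edit distance; this helper is not part of the model repo):

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or word lists), via classic DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            # deletion, insertion, or substitution (free if characters match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def cer(reference, hypothesis):
    """Character Error Rate: edits per reference character."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)

def wer(reference, hypothesis):
    """Word Error Rate: edits per reference word."""
    ref_words = reference.split()
    return levenshtein(ref_words, hypothesis.split()) / max(len(ref_words), 1)
```

Comparing the model's transcription against a ground-truth Khasi string with `cer()`/`wer()` gives the error rates; libraries such as `jiwer` offer the same metrics off the shelf.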
## Usage
To use this model on Kaggle or a local GPU, ensure you have `transformers`, `accelerate`, and `bitsandbytes` installed.
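A minimal install command for the packages listed above (versions unpinned; pin them if you need reproducibility):

```shell
pip install -U transformers accelerate bitsandbytes
```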
```python
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig
import torch

model_id = "toiar/deepseek-ocr-2-khasi-v1"

# Load the model in 4-bit to save memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

# Set your image path and prompt
image_path = "your_khasi_image.jpg"
prompt = "<image>\nFree OCR."

# Run the transcription
res = model.infer(
    tokenizer,
    prompt=prompt,
    image_file=image_path,
    output_path="./outputs",
    save_results=False,
)

# View results
if isinstance(res, dict) and "text" in res:
    print(res["text"])
else:
    print(res)
```
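To transcribe many scans at once, the single-image call above can be wrapped in a loop. This is a hypothetical helper (the `transcribe_folder` name and the `.jpg`-only glob are assumptions, and the `.infer` signature is taken from the example above; adjust if the repo's API differs):

```python
from pathlib import Path

def transcribe_folder(model, tokenizer, folder, out_dir="./outputs"):
    """Run Free OCR on every .jpg in `folder`, returning {filename: text}."""
    results = {}
    for img in sorted(Path(folder).glob("*.jpg")):
        res = model.infer(
            tokenizer,
            prompt="<image>\nFree OCR.",
            image_file=str(img),
            output_path=out_dir,
            save_results=False,
        )
        # Mirror the result handling from the single-image example
        results[img.name] = res["text"] if isinstance(res, dict) and "text" in res else res
    return results
```

Reusing one loaded model/tokenizer pair across the whole folder avoids paying the 4-bit loading cost per image.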