dots.ocr AWQ 4-bit Quantized

This is a 4-bit AWQ quantized version of rednote-hilab/dots.ocr.

Model Details

  • Base Model: rednote-hilab/dots.ocr
  • Quantization: W4A16 (4-bit weights, 16-bit activations)
  • Method: AWQ, applied with llm-compressor
  • Size: ~1.5GB (down from ~6GB for the fp16 base model)
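The ~4x shrinkage follows directly from the weight bit-width. A back-of-envelope check (using the ~6GB fp16 size quoted above; real checkpoints come out slightly larger because of quantization scales/zero-points and any layers left unquantized):

```python
# 4-bit weights vs. 16-bit weights cut weight storage by 16/4 = 4x,
# ignoring quantization metadata and non-quantized layers.
fp16_bits = 16
awq_bits = 4
base_size_gb = 6.0  # approximate fp16 checkpoint size from the card

quant_size_gb = base_size_gb * awq_bits / fp16_bits
print(quant_size_gb)  # → 1.5
```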

Usage

from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "sugam24/dots-ocr-awq-4bit",
    torch_dtype="auto",
    trust_remote_code=True,
    device_map="cuda",
)
tokenizer = AutoTokenizer.from_pretrained("sugam24/dots-ocr-awq-4bit", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("sugam24/dots-ocr-awq-4bit", trust_remote_code=True)

# Run OCR on a document image (placeholder file and prompt; see the
# base model card for dots.ocr's full prompt formats).
image = Image.open("page.png")
inputs = processor(text="Extract the text from this image.", images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])

License

Same as the base model (Apache 2.0).
