dots.ocr AWQ 4-bit Quantized

This is a 4-bit AWQ quantized version of rednote-hilab/dots.ocr.

Model Details

  • Base Model: rednote-hilab/dots.ocr
  • Quantization: W4A16 (4-bit weights, 16-bit activations)
  • Method: AWQ, applied with llm-compressor
  • Size: ~1.5GB (down from ~6GB for the fp16 base model)
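The ~4x shrinkage follows directly from the weight bit-width. A back-of-envelope check (using the ~6GB fp16 size quoted above; real checkpoints come out slightly larger because of quantization scales/zero-points and any layers left unquantized):

```python
# 4-bit weights vs. 16-bit weights cut weight storage by 16/4 = 4x,
# ignoring quantization metadata and non-quantized layers.
fp16_bits = 16
awq_bits = 4
base_size_gb = 6.0  # approximate fp16 checkpoint size from the card

quant_size_gb = base_size_gb * awq_bits / fp16_bits
print(quant_size_gb)  # → 1.5
```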

Usage

from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "sugam24/dots-ocr-awq-4bit",
    torch_dtype="auto",
    trust_remote_code=True,
    device_map="cuda",
)
tokenizer = AutoTokenizer.from_pretrained("sugam24/dots-ocr-awq-4bit", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("sugam24/dots-ocr-awq-4bit", trust_remote_code=True)

# Run OCR on a document image (placeholder file and prompt; see the
# base model card for dots.ocr's full prompt formats).
image = Image.open("page.png")
inputs = processor(text="Extract the text from this image.", images=image, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=1024)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])

License

Same as the base model (Apache 2.0).
