---
license: apache-2.0
base_model: rednote-hilab/dots.mocr
library_name: transformers
pipeline_tag: image-text-to-text
tags:
  - fp8
  - compressed-tensors
  - llm-compressor
  - quantized
  - multimodal
---

# dots.mocr-FP8

FP8-quantized version of [`rednote-hilab/dots.mocr`](https://huggingface.co/rednote-hilab/dots.mocr).

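This model was quantized with [`llm-compressor`](https://github.com/vllm-project/llm-compressor) using the `FP8_DYNAMIC` scheme: the `Linear` layers of the text backbone store their weights in FP8, and their input activations are quantized to FP8 dynamically at inference time, so no calibration dataset is required. The custom vision tower was intentionally excluded from quantization and kept in BF16.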
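
As a rough illustration of what "dynamic" activation quantization means, the sketch below computes a fresh scale for each token from that token's own values and rounds the scaled values onto an FP8 grid. It is a pure-Python simulation written for this card, not the kernel used by `llm-compressor` or any inference engine. It assumes the E4M3 format (`float8_e4m3fn`, maximum finite value 448) and per-token scales; the helper names `round_to_e4m3`, `quantize_token`, and `dequantize_token` are hypothetical.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in FP8 E4M3 (float8_e4m3fn)


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest value on a simulated E4M3 grid (3 mantissa bits)."""
    if x == 0.0:
        return 0.0
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))  # saturate out-of-range values
    # frexp gives abs(x) = m * 2**exp with 0.5 <= m < 1, so the leading bit is 2**(exp - 1).
    _, exp = math.frexp(abs(x))
    e = max(exp - 1, -6)   # -6 is the smallest normal exponent; smaller values are subnormal
    step = 2.0 ** (e - 3)  # spacing between neighbouring representable values
    return round(x / step) * step


def quantize_token(row):
    """Quantize one token's activations with a scale derived from that row alone."""
    amax = max(abs(v) for v in row)
    scale = (amax or 1.0) / FP8_E4M3_MAX  # guard against an all-zero row
    q = [round_to_e4m3(v / scale) for v in row]
    return q, scale


def dequantize_token(q, scale):
    """Map simulated FP8 values back to the original range."""
    return [v * scale for v in q]


# Illustrative activations: each row is one token.
activations = [
    [0.5, -2.0, 1.0],     # small-magnitude token
    [120.0, 3.0, -75.0],  # large-magnitude token gets its own, larger scale
]
for row in activations:
    q, scale = quantize_token(row)
    print(scale, q, dequantize_token(q, scale))
```

Because the scale is recomputed for every token at run time, nothing about the activations has to be measured ahead of time, which is why this scheme can be applied without calibration data.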

## Quantization details

- **Base model:** `rednote-hilab/dots.mocr`
- **Quantization tool:** `llm-compressor`
- **Saved format:** `compressed-tensors`
- **Quantization scheme:** `FP8_DYNAMIC`
- **Targets:** `Linear`
- **Ignored modules:**
  - `lm_head`
  - `.*vision_tower.*` (regular expression; written as `re:.*vision_tower.*` in the recipe)
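
To make the effect of the ignore list concrete, here is a minimal sketch of how such a list could be applied to module names. It assumes that entries prefixed with `re:` are treated as regular expressions matched from the start of the name and that all other entries must match the name exactly; the real matching in `llm-compressor` may cover more cases, such as matching by class name. The function `is_ignored` and the module names are hypothetical and are not taken from the actual checkpoint.

```python
import re

# Ignore list as written in the quantization recipe.
IGNORE = ["lm_head", "re:.*vision_tower.*"]


def is_ignored(module_name: str, patterns=IGNORE) -> bool:
    """Return True if module_name is excluded from quantization (simplified logic)."""
    for pattern in patterns:
        if pattern.startswith("re:"):
            # Regex entry: strip the prefix and match from the start of the name.
            if re.match(pattern[3:], module_name):
                return True
        elif pattern == module_name:
            # Plain entry: exact name match only.
            return True
    return False


# Hypothetical module names, for illustration only.
module_names = [
    "lm_head",
    "model.layers.0.self_attn.q_proj",
    "vision_tower.blocks.0.attn.qkv",
    "model.layers.0.mlp.down_proj",
]
for name in module_names:
    status = "kept in original precision" if is_ignored(name) else "quantized to FP8"
    print(f"{name}: {status}")
```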

## Quantization recipe

```python
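from transformers import AutoModelForCausalLM, AutoProcessor
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "rednote-hilab/dots.mocr"
SAVE_DIR = "binedge/dots.mocr-FP8"

# Load the base model and its processor. The loader class and
# trust_remote_code=True are assumptions for a model with custom code;
# follow the base model card if it specifies a different loading path.
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto", trust_remote_code=True)
processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

# Quantize every Linear layer to FP8, skipping the output head and the vision tower.
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",
    ignore=[
        "lm_head",
        "re:.*vision_tower.*",
    ],
)

# FP8_DYNAMIC needs no calibration data, so oneshot runs without a dataset.
oneshot(model=model, recipe=recipe)

# Save the weights in compressed-tensors format, together with the processor.
model.save_pretrained(SAVE_DIR, save_compressed=True)
processor.save_pretrained(SAVE_DIR)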
```