---
license: apache-2.0
base_model: rednote-hilab/dots.mocr
library_name: transformers
pipeline_tag: image-text-to-text
tags:
- fp8
- compressed-tensors
- llm-compressor
- quantized
- multimodal
---
# dots.mocr-FP8
FP8-quantized version of [`rednote-hilab/dots.mocr`](https://huggingface.co/rednote-hilab/dots.mocr).
This model was quantized with [`llm-compressor`](https://github.com/vllm-project/llm-compressor) using the `FP8_DYNAMIC` scheme on the text backbone: `Linear` weights are stored in FP8, and activations are quantized to FP8 dynamically at inference time. The custom vision tower was intentionally excluded from quantization and kept in BF16.
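To make "dynamic activation quantization" concrete, here is a minimal pure-Python sketch of the idea: each token's scale is computed on the fly from its largest absolute value, so no calibration data is involved. It assumes the E4M3 FP8 format (largest finite value 448) and one scale per token; the helper names `round_to_e4m3`, `quantize_token`, and `dequantize` are hypothetical and are not part of `llm-compressor`, whose real kernels work on tensors rather than Python lists.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in the (assumed) E4M3 format


def round_to_e4m3(v: float) -> float:
    """Round a float to the nearest value on a simulated E4M3 grid."""
    if v == 0.0:
        return 0.0
    sign = -1.0 if v < 0 else 1.0
    a = min(abs(v), FP8_E4M3_MAX)  # saturate instead of overflowing
    # frexp returns a = m * 2**e with m in [0.5, 1), so floor(log2(a)) = e - 1.
    exponent = max(math.frexp(a)[1] - 1, -6)  # -6: smallest normal exponent
    step = 2.0 ** (exponent - 3)  # 3 mantissa bits: 8 steps per power of two
    return sign * min(round(a / step) * step, FP8_E4M3_MAX)


def quantize_token(token):
    """Quantize one token's activations with a scale derived at runtime."""
    amax = max(abs(x) for x in token)
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    return [round_to_e4m3(x / scale) for x in token], scale


def dequantize(quantized, scale):
    """Map simulated FP8 values back to the original range."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    activations = [0.013, -0.42, 1.7, 3.1]  # made-up values for illustration
    q, scale = quantize_token(activations)
    print("scale:", scale)
    print("quantized:", q)
    print("dequantized:", dequantize(q, scale))
```

Because the scale follows each token's own range, the largest activation always maps to the top of the FP8 range, and the rounding error of the other values is relative to that maximum.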
## Quantization details
- **Base model:** `rednote-hilab/dots.mocr`
- **Quantization tool:** `llm-compressor`
- **Saved format:** `compressed-tensors`
- **Quantization scheme:** `FP8_DYNAMIC`
- **Targets:** `Linear`
- **Ignored modules:**
  - `lm_head`
  - `.*vision_tower.*`
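The ignore list can be read as a filter over module names. The sketch below illustrates that reading under two assumptions: a plain entry must equal the full module name, and an entry prefixed with `re:` is a regular expression matched from the start of the name. The helper `is_ignored` and the example module names are hypothetical; the actual matching is done inside `llm-compressor` and may differ in detail.

```python
import re

# Ignore entries as written in the quantization recipe.
IGNORE = ["lm_head", "re:.*vision_tower.*"]


def is_ignored(module_name: str, ignore=IGNORE) -> bool:
    """Return True if a module name is excluded from quantization."""
    for pattern in ignore:
        if pattern.startswith("re:"):
            # Assumption: "re:" marks a regex matched from the start of the name.
            if re.match(pattern[3:], module_name):
                return True
        elif pattern == module_name:
            # Assumption: plain entries must equal the full module name.
            return True
    return False


if __name__ == "__main__":
    # Hypothetical module names, for illustration only.
    for name in (
        "lm_head",
        "vision_tower.blocks.0.attn.qkv",
        "model.layers.0.self_attn.q_proj",
    ):
        print(name, "-> ignored" if is_ignored(name) else "-> quantized")
```

Under these assumptions, only `Linear` layers outside `lm_head` and the vision tower are quantized, which is consistent with the vision tower staying in BF16.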
## Quantization recipe
```python
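from transformers import AutoModelForCausalLM, AutoProcessor
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

# Assumed loading step (not shown in the original snippet); the Auto classes and
# trust_remote_code=True are a guess based on the model's custom vision tower.
MODEL_ID = "rednote-hilab/dots.mocr"
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto", trust_remote_code=True)
processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

# FP8_DYNAMIC computes activation scales at runtime, so no calibration dataset is passed.
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",
    ignore=[
        "lm_head",  # keep the output projection unquantized
        "re:.*vision_tower.*",  # regex: skip every module in the vision tower
    ],
)

oneshot(model=model, recipe=recipe)
model.save_pretrained("binedge/dots.mocr-FP8", save_compressed=True)
processor.save_pretrained("binedge/dots.mocr-FP8")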
```