---
license: apache-2.0
base_model: rednote-hilab/dots.mocr
library_name: transformers
pipeline_tag: image-text-to-text
tags:
- fp8
- compressed-tensors
- llm-compressor
- quantized
- multimodal
---

# dots.mocr-FP8

FP8-quantized version of [`rednote-hilab/dots.mocr`](https://huggingface.co/rednote-hilab/dots.mocr).

This model was quantized with [`llm-compressor`](https://github.com/vllm-project/llm-compressor) using FP8 dynamic activation quantization for the text backbone. The custom vision tower was intentionally excluded from quantization and kept in BF16.

## Quantization details

- **Base model:** `rednote-hilab/dots.mocr`
- **Quantization tool:** `llm-compressor`
- **Saved format:** `compressed-tensors`
- **Quantization scheme:** `FP8_DYNAMIC`
- **Targets:** `Linear`
- **Ignored modules:**
  - `lm_head`
  - `.*vision_tower.*`

## Quantization recipe

```python
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",
    ignore=[
        "lm_head",
        "re:.*vision_tower.*",
    ],
)

oneshot(model=model, recipe=recipe)

model.save_pretrained("binedge/dots.mocr-FP8", save_compressed=True)
processor.save_pretrained("binedge/dots.mocr-FP8")
```
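To make "dynamic activation quantization" more concrete, the following is a minimal pure-Python sketch of the arithmetic, not the kernels that actually run at inference time. It assumes the FP8 E4M3 format (largest finite value 448, 3 mantissa bits) and a per-token scale computed on the fly from the activation's largest magnitude. The helper names `round_to_e4m3`, `quantize_token_dynamic`, and `dequantize` are hypothetical and exist only for this illustration.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in FP8 E4M3 (assumed format)


def round_to_e4m3(value: float) -> float:
    """Round a float to the nearest FP8 E4M3 value, saturating at +/-448."""
    if value == 0.0:
        return 0.0
    magnitude = min(abs(value), FP8_E4M3_MAX)
    # floor(log2(magnitude)), clamped to the smallest normal exponent (-6);
    # below that, E4M3 falls back to evenly spaced subnormals.
    exponent = max(math.frexp(magnitude)[1] - 1, -6)
    # 3 mantissa bits give 8 representable steps per power of two.
    step = 2.0 ** (exponent - 3)
    rounded = round(magnitude / step) * step
    return math.copysign(min(rounded, FP8_E4M3_MAX), value)


def quantize_token_dynamic(activations: list[float]) -> tuple[list[float], float]:
    """Quantize one token's activations with a scale derived at runtime."""
    amax = max(abs(a) for a in activations)
    # Map the largest magnitude onto the FP8 maximum; avoid dividing by zero.
    scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
    quantized = [round_to_e4m3(a / scale) for a in activations]
    return quantized, scale


def dequantize(quantized: list[float], scale: float) -> list[float]:
    """Recover approximate original values from FP8 values and their scale."""
    return [q * scale for q in quantized]


# Hypothetical activation vector for a single token.
token = [0.02, -1.5, 3.0, 0.75]
quantized, scale = quantize_token_dynamic(token)
restored = dequantize(quantized, scale)
```

Because the scale is recomputed for every token, no calibration data is needed for the activations, which is consistent with the recipe above calling `oneshot` without a dataset. The cost is a small rounding error per value, bounded at roughly 6% relative error for values in the normal E4M3 range.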