sayed0am/Nanonets-OCR2-3B-FP8-Dynamic

Image-Text-to-Text

text-generation-inference

compressed-tensors


sayed0am committed on Dec 9, 2025

Commit c987b62 · verified · 1 Parent(s): cd7f9d9

Create README.md

Files changed (1)

README.md +44 -0


	@@ -0,0 +1,44 @@

+---
+language:
+- multilingual
+base_model:
+- nanonets/Nanonets-OCR2-3B
+tags:
+- OCR
+- image-to-text
+- pdf2markdown
+- VQA
+pipeline_tag: image-text-to-text
+library_name: transformers
+---
+Creation Code
+```python
+from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
+from llmcompressor import oneshot
+from llmcompressor.modifiers.quantization import QuantizationModifier
+from llmcompressor.utils import dispatch_for_generation
+MODEL_ID = "nanonets/Nanonets-OCR2-3B"
+# Load model.
+model = Qwen2_5_VLForConditionalGeneration.from_pretrained(MODEL_ID, torch_dtype="auto")
+processor = AutoProcessor.from_pretrained(MODEL_ID)
+# Configure the quantization algorithm and scheme.
+# In this case, we:
+#   * quantize the weights to FP8 per channel, statically, via PTQ
+#   * quantize the activations to FP8 per token, dynamically at runtime
+recipe = QuantizationModifier(
+    targets="Linear",
+    scheme="FP8_DYNAMIC",
+    ignore=["lm_head", "re:visual.*", "re:model.visual.*"],
+)
+# Apply quantization in place. Saving in compressed-tensors format needs a
+# separate model.save_pretrained(...) call, which this snippet does not include.
+oneshot(model=model, recipe=recipe)
+```
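The `ignore` list in the recipe mixes an exact name (`lm_head`) with two `re:`-prefixed patterns. The sketch below approximates how such a list could be evaluated, assuming that a `re:` entry is applied with `re.match` (anchored at the start of the module name) and that any other entry must equal the name exactly. The helper `is_ignored` and the module names are hypothetical and only illustrate why both `visual.*` and `model.visual.*` are listed: the vision tower may sit at either prefix depending on the `transformers` version.

```python
import re

# Ignore list copied from the recipe in the README.
IGNORE = ["lm_head", "re:visual.*", "re:model.visual.*"]


def is_ignored(module_name, ignore=IGNORE):
    """Approximate the ignore check (hypothetical helper, not library code)."""
    for pattern in ignore:
        if pattern.startswith("re:"):
            # Assumption: "re:" entries are regexes matched from the start of the name.
            if re.match(pattern[3:], module_name):
                return True
        elif pattern == module_name:
            # Assumption: plain entries must equal the module name exactly.
            return True
    return False


# Illustrative module names; the real names depend on the model implementation.
example_modules = [
    "lm_head",
    "visual.blocks.0.attn.qkv",
    "model.visual.blocks.0.attn.qkv",
    "model.language_model.layers.0.mlp.down_proj",
]

skipped = [name for name in example_modules if is_ignored(name)]
quantized = [name for name in example_modules if not is_ignored(name)]
print("kept in original precision:", skipped)
print("quantized to FP8:", quantized)
```

Under these assumptions only the language-model projection is quantized, while the output head and the vision tower keep their original precision.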