AINovice2005/quantized-GLM-4.1V-9B-Thinking

AINovice2005 committed on Jul 17, 2025

Commit 9b31e31 · verified · 1 Parent(s): 47d1351

Update README.md

Files changed (1)

README.md +41 -1

@@ -3,4 +3,44 @@ base_model:
 - THUDM/GLM-4.1V-9B-Thinking
 pipeline_tag: image-text-to-text
 library_name: transformers
----

+---
+**GLM-4.1V-9B-Thinking • Quantized**
+---
+### 🚀 Model Description
+This is a **quantized version** of **GLM-4.1V-9B-Thinking**, a 9B-parameter vision-language model that uses the “thinking paradigm” and
+reinforcement learning to strengthen its reasoning. Quantization reduces memory usage and can speed up inference on consumer-grade GPUs while
+largely preserving the model's performance on multimodal reasoning tasks.
+
+---
+### Quantization Details
+* **Method**: torchao quantization
+* **Weight Precision**: int8
+* **Activation Precision**: int8 dynamic
+* **Technique**: Symmetric mapping
+* **Impact**: Substantially smaller model size, with minimal expected loss in reasoning, coding, and general instruction-following capabilities.
+---
+### 🎯 Intended Use
+Well suited to:
+* Vision-language applications with long contexts and heavy reasoning
+* On-device or low-VRAM inference in latency-sensitive environments
+* Challenging multimodal tasks: image Q&A, reasoning over diagrams, high-resolution visual analysis
+* Research into quantized vision-language deployment
+---
+### ⚠️ Limitations
+* Possible minor drop in fine-grained reasoning accuracy compared with the full-precision model
+* Inherits the original model's general LLM caveats: hallucinations, bias, and prompt sensitivity
+---
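
As an illustration of the "Symmetric mapping" entry under Quantization Details, here is a minimal pure-Python sketch of symmetric int8 quantization. It is not torchao's implementation, and the model card does not say whether scales are computed per tensor, per channel, or per token. The helper names `quantize_symmetric_int8` and `dequantize`, the single scale per list, and the sample weights are all assumptions made for illustration.

```python
def quantize_symmetric_int8(values):
    """Map floats to int8 codes using one scale and a zero-point fixed at 0.

    Illustrative only: a single scale for the whole list is an assumption,
    not a description of how torchao groups values.
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any positive scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric range [-127, 127].
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from int8 codes."""
    return [c * scale for c in codes]


# Hypothetical sample weights, chosen only to make the mapping easy to follow.
weights = [0.5, -1.27, 0.0, 1.0]
codes, scale = quantize_symmetric_int8(weights)
restored = dequantize(codes, scale)

print(codes)
print(restored)
```

Because the zero-point is fixed at 0, dequantization is a single multiplication per value, and each restored value differs from the original by at most about half a scale step.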