AINovice2005 committed (verified) · Commit 9b31e31 · 1 Parent(s): 47d1351

Update README.md

Files changed (1): README.md (+41 −1)
---
base_model:
- THUDM/GLM-4.1V-9B-Thinking
pipeline_tag: image-text-to-text
library_name: transformers
---

**GLM‑4.1V‑9B‑Thinking • Quantized**
---

### 🚀 Model Description

This is a **quantized version** of **GLM‑4.1V‑9B‑Thinking**, a powerful 9B‑parameter vision‑language model built around an explicit “thinking” paradigm with reinforcement‑learning‑enhanced reasoning. Quantization substantially reduces memory usage and speeds up inference on consumer-grade GPUs while preserving the model's strong performance on multimodal reasoning tasks.
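The card itself does not ship usage code; as a minimal sketch, the same int8 weight / int8 dynamic-activation setup can be reproduced through transformers' torchao integration (this assumes a recent `transformers` with the `torchao` package installed, and uses the base checkpoint named in the card's metadata — check this repository's files for a ready-quantized checkpoint first):

```python
# Sketch: applying torchao int8 weight + int8 dynamic-activation quantization
# while loading the base model. Requires `pip install torchao` and a recent
# transformers release with TorchAoConfig support; class choice is an assumption
# based on the card's image-text-to-text pipeline tag.
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText, TorchAoConfig

model_id = "THUDM/GLM-4.1V-9B-Thinking"  # base model from the card's metadata

# int8 weights, int8 dynamic activations (symmetric mapping is torchao's default)
quant_config = TorchAoConfig("int8_dynamic_activation_int8_weight")

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quant_config,
)
```

Loading the weights requires a GPU with enough free VRAM for the 9B checkpoint; after quantization the resident size drops roughly in line with the int8 packing.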

---

## Quantization Details

- **Method**: torchao quantization
- **Weight precision**: int8
- **Activation precision**: int8, dynamic
- **Mapping**: symmetric
- **Impact**: significant reduction in model size with minimal loss in reasoning, coding, and general instruction-following capability
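Symmetric mapping means a single scale factor maps floats to int8 so that zero stays exactly zero (no zero-point offset). An illustrative per-tensor sketch in plain Python — not torchao's actual kernels, which work per channel and on packed tensors:

```python
# Illustrative per-tensor symmetric int8 quantization (assumes a nonzero tensor).
def quantize_int8_symmetric(weights):
    """Map floats to int8 with one symmetric scale; zero maps exactly to zero."""
    scale = max(abs(w) for w in weights) / 127.0  # largest magnitude -> +/-127
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats; error is at most about half a scale step."""
    return [qi * scale for qi in q]

w = [0.5, -1.0, 0.25, 0.0]
q, s = quantize_int8_symmetric(w)   # q == [64, -127, 32, 0]
w_hat = dequantize(q, s)
```

"int8 dynamic" activations means the activation scale is computed the same way at runtime from each incoming batch, rather than being fixed at calibration time.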

---

### 🎯 Intended Use

Well suited for:

* Vision‑language applications with long contexts and heavy reasoning
* On-device or low-VRAM inference in latency‑sensitive environments
* Demanding multimodal tasks: image Q&A, reasoning over diagrams, high-resolution visual analysis
* Research on quantized vision‑language model deployment

---

### ⚠️ Limitations

* Minor drop in fine-grained reasoning accuracy versus the full-precision model
* Inherits the original model's general LLM caveats: hallucinations, bias, and sensitivity to prompting

---