Update README.md
README.md
CHANGED
@@ -77,10 +77,10 @@ Measured on **NVIDIA A100-SXM4-40GB**, Qwen2.5-VL-7B-Instruct, bfloat16, SDPA at
 
 | Config | Tokens kept | Time | Speedup | Output quality |
 |---|---|---|---|---|
-| Baseline | 16320 (100%) | 9738ms | 1.00× |
-| SparseVLM 50% | 8192 | 9441ms | 1.03× |
-| SparseVLM 25% | 4080 | 9297ms | 1.05× |
-| SparseVLM 10% | 1632 | 9425ms | 1.03× |
+| Baseline | 16320 (100%) | 9738ms | 1.00× | Identifies Fuji, Milky Way, snow cap, star colors |
+| SparseVLM 50% | 8192 | 9441ms | 1.03× | Same quality |
+| SparseVLM 25% | 4080 | 9297ms | 1.05× | All key details preserved |
+| SparseVLM 10% | 1632 | 9425ms | 1.03× | Still correctly describes scene |
 
 > **Key result:** Full 4K image (16K tokens) runs without OOM. Without SparseVLM's hook-based scoring, the 16K-token image requires materialising a 15GB attention matrix and crashes. The scorer computes only the text→visual submatrix (35 × 16320 = 32MB instead of 15GB).
 
@@ -174,10 +174,10 @@ remove_hooks(state)
 
 | Model | Status |
 |---|---|
-| Qwen/Qwen2.5-VL-7B-Instruct |
-| Qwen/Qwen2.5-VL-3B-Instruct |
-| Qwen/Qwen2.5-VL-72B-Instruct |
-| Qwen/Qwen2-VL-* |
+| Qwen/Qwen2.5-VL-7B-Instruct | Tested |
+| Qwen/Qwen2.5-VL-3B-Instruct | Should work |
+| Qwen/Qwen2.5-VL-72B-Instruct | Should work |
+| Qwen/Qwen2-VL-* | Legacy support |
 
 ---
 
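
Note on the key-result arithmetic: the README states "35 × 16320 = 32MB instead of 15GB" without showing the per-element size or head count. The sketch below reproduces both figures under assumptions that are not stated in the diff: 28 attention heads (the head count I believe the Qwen2.5-VL-7B language model uses), 2 bytes per bfloat16 score, one layer's scores held at a time, and only the 16320 visual tokens counted in the full matrix. The function name `attention_bytes` is hypothetical and is not part of the repository.

```python
# Rough memory estimate for attention score matrices.
# Assumptions (not from the README): 28 heads, 2 bytes per bfloat16
# element, a single layer's scores materialised at once.

N_VISUAL = 16320   # visual tokens for the 4K image (from the table)
N_TEXT = 35        # text query tokens (from the key-result line)


def attention_bytes(n_queries: int, n_keys: int,
                    n_heads: int = 28, bytes_per_elem: int = 2) -> int:
    """Bytes needed to hold an (n_heads, n_queries, n_keys) score tensor."""
    return n_queries * n_keys * n_heads * bytes_per_elem


# Full visual-to-visual attention matrix.
full = attention_bytes(N_VISUAL, N_VISUAL)
# Text-to-visual submatrix only, as the scorer is described to compute.
sub = attention_bytes(N_TEXT, N_VISUAL)

print(f"full matrix: {full / 1e9:.1f} GB")   # full matrix: 14.9 GB
print(f"submatrix:   {sub / 1e6:.1f} MB")    # submatrix:   32.0 MB
```

Under these assumptions the submatrix is smaller by a factor of 16320 / 35, roughly 466×, which matches the README's 15GB versus 32MB comparison.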