Fix ColPali claim; add single-vector + storage highlights
README.md
CHANGED
@@ -68,7 +68,9 @@ NanoVDR-S-Multi is a **69M-parameter multilingual text-only** query encoder for
 
 - **95.1% teacher retention** — a 69M text-only model recovers 95% of a 2B VLM teacher across 22 ViDoRe datasets
 - **Outperforms DSE-Qwen2 (2B)** on multilingual v2 (+6.2) and v3 (+4.1) with **32x fewer parameters**
-- **Outperforms ColPali (~3B)** on
+- **Outperforms ColPali (~3B)** on multilingual v2 (+7.2) and v3 (+4.5) with **single-vector cosine** retrieval (no MaxSim)
+- **Single-vector retrieval** — queries and documents share the same 2048-dim embedding space as [Qwen3-VL-Embedding-2B](https://huggingface.co/Qwen/Qwen3-VL-Embedding-2B); retrieval is a plain dot product, FAISS-compatible, **4 KB per page** (float16)
+- **Lightweight on storage** — 282 MB model file; doc index costs 64× less than ColPali's multi-vector patches
 - **51 ms CPU query latency** — 50x faster than DSE-Qwen2, 143x faster than ColPali
 - **6 languages**: English, German, French, Spanish, Italian, Portuguese — all >92% teacher retention
 
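The storage and retrieval figures in the added bullets can be sanity-checked with a short sketch. It reproduces the 4 KB-per-page arithmetic (2048 dimensions at 2 bytes each) and shows single-vector cosine retrieval as a dot product over normalized vectors. The multi-vector layout of 1030 vectors of 128 dimensions per page is an assumption used only to illustrate where a ratio of roughly 64× could come from; the README does not state it. The function names and the random test data are hypothetical, and plain NumPy stands in for a FAISS index.

```python
import numpy as np

DIM = 2048             # embedding width stated in the README
BYTES_PER_FLOAT16 = 2  # float16 storage, as stated in the README


def single_vector_bytes(dim: int = DIM) -> int:
    """Storage for one page embedding stored as float16."""
    return dim * BYTES_PER_FLOAT16


def multi_vector_bytes(num_vectors: int = 1030, dim: int = 128) -> int:
    """Storage for one page under a multi-vector layout.

    ASSUMPTION: 1030 vectors x 128 dims per page is illustrative only
    and is not taken from the README.
    """
    return num_vectors * dim * BYTES_PER_FLOAT16


def cosine_top_k(query: np.ndarray, docs: np.ndarray, k: int = 3):
    """Rank documents by cosine similarity to the query.

    After L2 normalization, cosine similarity is a plain dot product.
    """
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Hypothetical data: 100 random "page" embeddings and a query near page 42.
rng = np.random.default_rng(0)
docs = rng.standard_normal((100, DIM)).astype(np.float32)
query = docs[42] + 0.1 * rng.standard_normal(DIM).astype(np.float32)

top_idx, top_scores = cosine_top_k(query, docs)
ratio = multi_vector_bytes() / single_vector_bytes()
print(single_vector_bytes(), "bytes per page")
print("nearest page:", int(top_idx[0]))
print(f"multi-vector / single-vector storage: {ratio:.1f}x")
```

Because every page is a single fixed-size vector, the same normalized matrix could be loaded into an inner-product index rather than scanned with NumPy; the brute-force version here is only meant to make the scoring rule explicit.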