fix typo
README.md
CHANGED
@@ -12,11 +12,11 @@ tags:
 - multilingual-embedding
 - colqwen3
 ---
-# OpenSearch-AI/Ops-

-**Ops-

-The model is trained using a multi-stage strategy that combines large-scale text-based retrieval datasets with diverse visual document data. This hybrid training approach significantly enhances its capability to handle complex document understanding and retrieval tasks. On the Vidore v1–v3 benchmarks, **Ops-

 ## Key Features

@@ -70,11 +70,11 @@ print(f"Scores:\n{scores}")

 | Model | Dim | Vidore v1+v2 | Vidore v2 | Vidore v1 |
 |--------------------------------------------|------|--------------|-----------|-----------|
-| **Ops-
-| **Ops-
-| **Ops-
-| **Ops-
-| **Ops-
 | tomoro-colqwen3-embed-8b | 320 | 83.52 | 65.4 | 90.8 |
 | EvoQwen2.5-VL-Retriever-7B-v1 | 128 | 83.41 | 65.2 | 90.7 |
 | tomoro-colqwen3-embed-4b | 320 | 83.18 | 64.7 | 90.6 |
@@ -90,11 +90,11 @@ print(f"Scores:\n{scores}")

 | Model | Dim | PUB AVG |
 |--------------------------------------------|------|---------|
-| **Ops-
-| **Ops-
-| **Ops-
-| **Ops-
-| **Ops-
 | tomoro-colqwen3-embed-4b | 320 | 60.19 |
 | SauerkrautLM-ColQwen3-8b-v0.1 | 128 | 58.55 |
 | jina-embedding-v4 | 128 | 57.54 |
@@ -102,7 +102,7 @@ print(f"Scores:\n{scores}")
 | SauerkrautLM-ColQwen3-4b-v0.1 | 128 | 56.03 |


-> With only **128 dimensions**, `Ops-


 ## Citation
@@ -112,8 +112,8 @@ If you use this model in your work, please cite:
 ```bibtex
 @misc{ops_colqwen3_4b,
 author = {{OpenSearch-AI}},
-title = {{Ops-
 year = {2026},
-howpublished = {\url{https://huggingface.co/OpenSearch-AI/Ops-
 }
 ```
@@ -12,11 +12,11 @@ tags:
 - multilingual-embedding
 - colqwen3
 ---
+# OpenSearch-AI/Ops-Colqwen3-4B

+**Ops-Colqwen3-4B** is a ColPali-style multimodal embedding model based on the **Qwen3-VL-4B-Instruct** architecture, developed and open-sourced by the Alibaba Cloud OpenSearch-AI team. It maps text queries and visual documents such as images and PDF pages into a unified, aligned **multi-vector embedding space**, enabling highly effective retrieval of visual documents.

+The model is trained using a multi-stage strategy that combines large-scale text-based retrieval datasets with diverse visual document data. This hybrid training approach significantly enhances its capability to handle complex document understanding and retrieval tasks. On the Vidore v1–v3 benchmarks, **Ops-Colqwen3-4B** achieves **state-of-the-art results** among models of comparable size.
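
Editor's note on the multi-vector embedding space described above: ColPali-style retrievers typically rank documents with late-interaction (MaxSim) scoring, where each query-token vector is matched to its most similar document-patch vector and the maxima are summed. The sketch below illustrates that scoring rule with NumPy on random toy data. The function names `maxsim_score` and `l2_normalize`, the token counts, and the toy pages are assumptions made for illustration; this is not the model's own scoring code, which the README excerpt does not show.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction (MaxSim) score for one query against one document.

    query_vecs: (num_query_tokens, dim), doc_vecs: (num_doc_patches, dim).
    Rows are assumed to be L2-normalized already.
    """
    sim = query_vecs @ doc_vecs.T        # (num_query_tokens, num_doc_patches)
    return float(sim.max(axis=1).sum())  # best patch per query token, summed


# Toy data (hypothetical): 4 query tokens, 16 patch vectors per page, dim 128.
rng = np.random.default_rng(0)
query = l2_normalize(rng.normal(size=(4, 128)))
page_a = l2_normalize(rng.normal(size=(16, 128)))
# page_b reuses 12 patches of page_a and additionally contains the query vectors,
# so every query token finds a perfect match there.
page_b = np.vstack([page_a[:12], query])

print("page_a:", maxsim_score(query, page_a))
print("page_b:", maxsim_score(query, page_b))
```

Because `page_b` contains an exact copy of every query vector, it scores higher than `page_a`, which is the behavior a retrieval ranking relies on.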

 ## Key Features

@@ -70,11 +70,11 @@ print(f"Scores:\n{scores}")

 | Model | Dim | Vidore v1+v2 | Vidore v2 | Vidore v1 |
 |--------------------------------------------|------|--------------|-----------|-----------|
+| **Ops-Colqwen3-4B** | 2560 | **84.87** | **68.7** | **91.4** |
+| **Ops-Colqwen3-4B** | 1280 | 84.71 | 68.2 | 91.3 |
+| **Ops-Colqwen3-4B** | 640 | 84.39 | 67.7 | 91.1 |
+| **Ops-Colqwen3-4B** | 320 | 84.12 | 67.0 | 91.0 |
+| **Ops-Colqwen3-4B** | 128 | 84.04 | 66.9 | 90.9 |
 | tomoro-colqwen3-embed-8b | 320 | 83.52 | 65.4 | 90.8 |
 | EvoQwen2.5-VL-Retriever-7B-v1 | 128 | 83.41 | 65.2 | 90.7 |
 | tomoro-colqwen3-embed-4b | 320 | 83.18 | 64.7 | 90.6 |
@@ -90,11 +90,11 @@ print(f"Scores:\n{scores}")

 | Model | Dim | PUB AVG |
 |--------------------------------------------|------|---------|
+| **Ops-Colqwen3-4B** | 2560 | 61.27 |
+| **Ops-Colqwen3-4B** | 1280 | **61.32** |
+| **Ops-Colqwen3-4B** | 640 | 61.21 |
+| **Ops-Colqwen3-4B** | 320 | 60.88 |
+| **Ops-Colqwen3-4B** | 128 | 60.23 |
 | tomoro-colqwen3-embed-4b | 320 | 60.19 |
 | SauerkrautLM-ColQwen3-8b-v0.1 | 128 | 58.55 |
 | jina-embedding-v4 | 128 | 57.54 |
@@ -102,7 +102,7 @@ print(f"Scores:\n{scores}")
 | SauerkrautLM-ColQwen3-4b-v0.1 | 128 | 56.03 |


+> With only **128 dimensions**, `Ops-Colqwen3-4B` outperforms other 4B-parameter models such as `tomoro-colqwen3-embed-4b`, making it well-suited for latency- and memory-constrained applications.
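
Editor's note on the memory claim above: in a multi-vector index, storage grows linearly with the embedding dimension, so the dimension settings listed in the tables differ substantially in footprint. The sketch below is plain arithmetic under stated assumptions. The helper `index_bytes`, the corpus size of one million pages, the 768 vectors per page, and float16 storage are all hypothetical values chosen for illustration; none of them come from the README.

```python
def index_bytes(num_pages: int, vectors_per_page: int, dim: int,
                bytes_per_value: int = 2) -> int:
    """Raw size of a multi-vector index: one `dim`-wide vector per token or patch.

    bytes_per_value=2 assumes float16 storage (an assumption, not a README fact).
    """
    return num_pages * vectors_per_page * dim * bytes_per_value


# Hypothetical corpus: 1,000,000 pages with 768 vectors each.
NUM_PAGES = 1_000_000
VECTORS_PER_PAGE = 768

# Dimension settings taken from the benchmark tables.
for dim in (2560, 1280, 640, 320, 128):
    gib = index_bytes(NUM_PAGES, VECTORS_PER_PAGE, dim) / 2**30
    print(f"dim={dim:>4}: {gib:,.1f} GiB")
```

Whatever the real corpus size and vector count are, the ratio is fixed: a 2560-dimensional index is 20 times larger than a 128-dimensional one.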


 ## Citation
@@ -112,8 +112,8 @@ If you use this model in your work, please cite:
 ```bibtex
 @misc{ops_colqwen3_4b,
 author = {{OpenSearch-AI}},
+title = {{Ops-Colqwen3: State-of-the-Art Multimodal Embedding Model for Visual Document Retrieval}},
 year = {2026},
+howpublished = {\url{https://huggingface.co/OpenSearch-AI/Ops-Colqwen3-4B}},
 }
 ```