OpenSearch-AI/Ops-Colqwen3-4B

@@ -12,11 +12,11 @@ tags:
 - multilingual-embedding
 - colqwen3
 ---
-# OpenSearch-AI/Ops-ColQwen3-4B
+# OpenSearch-AI/Ops-Colqwen3-4B
-**Ops-ColQwen3-4B** is a ColPali-style multimodal embedding model based on the **Qwen3-VL-4B-Instruct** architecture, developed and open-sourced by the Alibaba Cloud OpenSearch-AI team. It maps text queries and visual documents such as images and PDF pages into a unified, aligned **multi-vector embedding space**, enabling highly effective retrieval of visual documents.
+**Ops-Colqwen3-4B** is a ColPali-style multimodal embedding model based on the **Qwen3-VL-4B-Instruct** architecture, developed and open-sourced by the Alibaba Cloud OpenSearch-AI team. It maps text queries and visual documents such as images and PDF pages into a unified, aligned **multi-vector embedding space**, enabling highly effective retrieval of visual documents.
-The model is trained using a multi-stage strategy that combines large-scale text-based retrieval datasets with diverse visual document data. This hybrid training approach significantly enhances its capability to handle complex document understanding and retrieval tasks. On the Vidore v1–v3 benchmarks, **Ops-ColQwen3-4B** achieves **state-of-the-art results** among models of comparable size.
+The model is trained using a multi-stage strategy that combines large-scale text-based retrieval datasets with diverse visual document data. This hybrid training approach significantly enhances its capability to handle complex document understanding and retrieval tasks. On the Vidore v1–v3 benchmarks, **Ops-Colqwen3-4B** achieves **state-of-the-art results** among models of comparable size.
 ## Key Features
@@ -70,11 +70,11 @@ print(f"Scores:\n{scores}")
 | Model                                      | Dim  | Vidore v1+v2 | Vidore v2 | Vidore v1 |
 |--------------------------------------------|------|--------------|-----------|-----------|
-| **Ops-ColQwen3-4B**                        | 2560 | **84.87**    | **68.7**  | **91.4**  |
-| **Ops-ColQwen3-4B**                        | 1280 | 84.71        | 68.2      | 91.3      |
-| **Ops-ColQwen3-4B**                        | 640  | 84.39        | 67.7      | 91.1      |
-| **Ops-ColQwen3-4B**                        | 320  | 84.12        | 67.0      | 91.0      |
-| **Ops-ColQwen3-4B**                        | 128  | 84.04        | 66.9      | 90.9      |
+| **Ops-Colqwen3-4B**                        | 2560 | **84.87**    | **68.7**  | **91.4**  |
+| **Ops-Colqwen3-4B**                        | 1280 | 84.71        | 68.2      | 91.3      |
+| **Ops-Colqwen3-4B**                        | 640  | 84.39        | 67.7      | 91.1      |
+| **Ops-Colqwen3-4B**                        | 320  | 84.12        | 67.0      | 91.0      |
+| **Ops-Colqwen3-4B**                        | 128  | 84.04        | 66.9      | 90.9      |
 | tomoro-colqwen3-embed-8b                   | 320  | 83.52        | 65.4      | 90.8      |
 | EvoQwen2.5-VL-Retriever-7B-v1              | 128  | 83.41        | 65.2      | 90.7      |
 | tomoro-colqwen3-embed-4b                   | 320  | 83.18        | 64.7      | 90.6      |
@@ -90,11 +90,11 @@ print(f"Scores:\n{scores}")
 | Model                                      | Dim  | PUB AVG |
 |--------------------------------------------|------|---------|
-| **Ops-ColQwen3-4B**                        | 2560 |  61.27  |
-| **Ops-ColQwen3-4B**                        | 1280 |  **61.32**  |
-| **Ops-ColQwen3-4B**                        | 640  |  61.21  |
-| **Ops-ColQwen3-4B**                        | 320  |  60.88  |
-| **Ops-ColQwen3-4B**                        | 128  |  60.23  |
+| **Ops-Colqwen3-4B**                        | 2560 |  61.27  |
+| **Ops-Colqwen3-4B**                        | 1280 |  **61.32**  |
+| **Ops-Colqwen3-4B**                        | 640  |  61.21  |
+| **Ops-Colqwen3-4B**                        | 320  |  60.88  |
+| **Ops-Colqwen3-4B**                        | 128  |  60.23  |
 | tomoro-colqwen3-embed-4b          | 320  |  60.19  |
 | SauerkrautLM-ColQwen3-8b-v0.1              | 128  |  58.55  |
 | jina-embedding-v4                          | 128  |  57.54  |
@@ -102,7 +102,7 @@ print(f"Scores:\n{scores}")
 | SauerkrautLM-ColQwen3-4b-v0.1              | 128  |  56.03  |
-> With only **128 dimensions**, `Ops-ColQwen3-4B` outperforms other 4B-parameter models such as `tomoro-colqwen3-embed-4b`, making it well-suited for latency- and memory-constrained applications.
+> With only **128 dimensions**, `Ops-Colqwen3-4B` outperforms other 4B-parameter models such as `tomoro-colqwen3-embed-4b`, making it well-suited for latency- and memory-constrained applications.
 ## Citation
@@ -112,8 +112,8 @@ If you use this model in your work, please cite:
 ```bibtex
 @misc{ops_colqwen3_4b,
   author       = {{OpenSearch-AI}},
-  title        = {{Ops-ColQwen3: State-of-the-Art Multimodal Embedding Model for Visual Document Retrieval}},
+  title        = {{Ops-Colqwen3: State-of-the-Art Multimodal Embedding Model for Visual Document Retrieval}},
   year         = {2026},
-  howpublished = {\url{https://huggingface.co/OpenSearch-AI/Ops-ColQwen3-4B}},
+  howpublished = {\url{https://huggingface.co/OpenSearch-AI/Ops-Colqwen3-4B}},
 }
 ```
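
For readers of the model card text touched by this diff: the README describes a ColPali-style "multi-vector embedding space", where a query and a page are each represented by many token-level vectors instead of one. Models in this family are usually scored with late interaction (MaxSim). The sketch below illustrates that scoring rule with NumPy only. It is an assumption that this model is scored the same way; the diff does not show the scoring code, and the function name `maxsim_score` and the toy vectors are hypothetical, not part of the repository.

```python
import numpy as np


def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction (MaxSim) score for one query against one document.

    query_vecs: shape (num_query_tokens, dim)
    doc_vecs:   shape (num_doc_tokens, dim)
    Both are assumed to be L2-normalized already (an assumption of this sketch).
    """
    # Dot product between every query token and every document token.
    sim = query_vecs @ doc_vecs.T  # shape (num_query_tokens, num_doc_tokens)
    # For each query token keep its best-matching document token, then sum.
    return float(sim.max(axis=1).sum())


# Toy 2-dimensional vectors, chosen only to make the arithmetic easy to follow.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
page = np.array([[1.0, 0.0], [0.5, 0.5]])
print(maxsim_score(query, page))
```

Because the score is a sum of per-token maxima, storage and compute grow with the embedding dimension, which is why the lower-dimension rows in the tables above matter for memory-constrained deployments.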