How to use Qwen/Qwen3-VL-Reranker-2B with Transformers:

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Qwen/Qwen3-VL-Reranker-2B")
model = AutoModelForImageTextToText.from_pretrained("Qwen/Qwen3-VL-Reranker-2B")
```

How to use Qwen/Qwen3-VL-Reranker-2B with sentence-transformers:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("Qwen/Qwen3-VL-Reranker-2B")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
```
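The `predict` call in the sentence-transformers snippet returns one score per (query, passage) pair. A minimal sketch of turning those scores into a ranking follows. It assumes that a higher score means a more relevant passage, and it uses made-up placeholder scores rather than real model output, so it runs without downloading the model:

```python
# Passages from the snippet above.
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

# Placeholder values standing in for `model.predict(...)`; NOT real model output.
scores = [0.02, 0.97, 0.31, 0.05]

# Pair each passage with its score and sort from most to least relevant,
# assuming higher scores indicate higher relevance.
ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)

for rank, (passage, score) in enumerate(ranked, start=1):
    print(f"{rank}. {score:.2f}  {passage}")
```

With real scores, `scores` would simply be replaced by the array returned from `predict`.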
Improve model card: add pipeline tag, library name, and paper link
#8
by nielsr HF Staff - opened
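The fields named in the title (pipeline tag, library name) live in the YAML front matter at the top of README.md. A minimal sketch for checking that a card declares them is shown here. It assumes a flat front matter made only of `key: value` pairs and `- item` lists, and `read_front_matter` is a hypothetical helper written for illustration, not part of any Hugging Face library; a real project would use a proper YAML parser:

```python
def read_front_matter(text: str) -> dict:
    """Parse flat `key: value` pairs and `- item` lists between the `---` markers.

    Hypothetical helper for illustration only; it does not handle nested YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing marker ends the front matter
            break
        if stripped.startswith("- "):
            # List item: attach it to the most recent key that opened a list.
            if isinstance(meta.get(current_key), list):
                meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            # An empty value means a list follows on the next lines.
            meta[current_key] = value.strip() if value.strip() else []
    return meta


# Front matter as it appears on the new side of this PR's diff.
CARD = """---
base_model:
- Qwen/Qwen3-VL-2B-Instruct
license: apache-2.0
library_name: transformers
pipeline_tag: text-ranking
tags:
- transformers
- multimodal rerank
---

# Qwen3-VL-Reranker-2B
"""

meta = read_front_matter(CARD)
required = ("license", "library_name", "pipeline_tag")
missing = [key for key in required if key not in meta]
print(meta["pipeline_tag"], missing)
```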
README.md
CHANGED
@@ -1,17 +1,22 @@
 ---
-license: apache-2.0
 base_model:
 - Qwen/Qwen3-VL-2B-Instruct
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-ranking
 tags:
 - transformers
 - multimodal rerank
 ---
+
 # Qwen3-VL-Reranker-2B
 
 <p align="center">
 <img src="https://model-demo.oss-cn-hangzhou.aliyuncs.com/Qwen3-VL-Reranker.png" width="400"/>
 <p>
 
+This repository contains the **Qwen3-VL-Reranker-2B** model, as presented in the paper [Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking](https://huggingface.co/papers/2601.04720).
+
 ## Highlights
 
 The **Qwen3-VL-Embedding** and **Qwen3-VL-Reranker** model series are the latest additions to the Qwen family, built upon the recently open-sourced and powerful Qwen3-VL foundation model. Specifically designed for multimodal information retrieval and cross-modal understanding, this suite accepts diverse inputs including text, images, screenshots, and videos, as well as inputs containing a mixture of these modalities.
@@ -49,7 +54,6 @@ For more details, including benchmark evaluation, hardware requirements, and inf
 > - `Quantization Support` indicates the supported quantization post process for the output embedding.
 > - `MRL Support` indicates whether the embedding model supports custom dimensions for the final embedding.
 > - `Instruction Aware` notes whether the embedding or reranking model supports customizing the input instruction according to different tasks.
-> Our evaluation indicates that, for most downstream tasks, using instructions (instruct) typically yields an improvement of 1% to 5% compared to not using them. Therefore, we recommend that developers create tailored instructions specific to their tasks and scenarios. In multilingual contexts, we also advise users to write their instructions in English, as most instructions utilized during the model training process were originally written in English.
 
 ## Model Performance
 
@@ -187,7 +191,8 @@ def main():
 
     for query_dict in queries:
         query_text = query_dict.get('text', '')
-        print(f"
+        print(f"
+Query: {query_text}")
 
         scores = []
         for doc_dict in documents:
@@ -210,10 +215,10 @@ For more usage examples, please visit our [GitHub repository](https://github.com
 
 If you find our work helpful, feel free to give us a cite.
 
-```
+```bibtex
 @article{qwen3vlembedding,
 title={Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking},
-author={Li, Mingxin and Zhang, Yanzhao and Long, Dingkun and Chen Keqin and Song, Sibo and Bai, Shuai and Yang, Zhibo and Xie, Pengjun and Yang, An and Liu, Dayiheng and Zhou, Jingren and Lin, Junyang},
+author={Li, Mingxin and Zhang, Yanzhao and Long, Dingkun and Chen, Keqin and Song, Sibo and Bai, Shuai and Yang, Zhibo and Xie, Pengjun and Yang, An and Liu, Dayiheng and Zhou, Jingren and Lin, Junyang},
 journal={arXiv},
 year={2026}
 }
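The comma added to the `author` field matters because BibTeX separates authors on ` and ` and reads a name containing a comma as `Last, First`, while a name without a comma is read as `First Last`. A rough sketch of that rule follows. It is deliberately simplified (it ignores `von` particles, `Jr` parts, and brace protection), and `split_bibtex_authors` is a hypothetical helper, not a function from any BibTeX library:

```python
def split_bibtex_authors(field: str) -> list[tuple[str, str]]:
    """Return (first, last) pairs using a simplified version of BibTeX's name rule."""
    authors = []
    for name in field.split(" and "):
        name = name.strip()
        if "," in name:
            # "Last, First" form: the text before the comma is the family name.
            last, _, first = name.partition(",")
        else:
            # "First Last" form: the final word is taken as the family name.
            first, _, last = name.rpartition(" ")
        authors.append((first.strip(), last.strip()))
    return authors


# Excerpts of the author field before and after the change.
before = split_bibtex_authors("Long, Dingkun and Chen Keqin and Song, Sibo")
after = split_bibtex_authors("Long, Dingkun and Chen, Keqin and Song, Sibo")

print(before[1])  # family name is read as "Keqin"
print(after[1])   # family name is read as "Chen", consistent with the other entries
```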