DDDDZQ committed (verified)
Commit 6d5f4f5 · Parent(s): a02ae08

Update README.md

Files changed (1):
  1. README.md +23 -13
README.md CHANGED
@@ -10,7 +10,7 @@ base_model:
 # Lychee-rerank-mm
 
 `Lychee-rerank-mm` is the latest generalist multimodal reranking model developed based on the `Qwen2.5-VL-Instruct` foundation model. It is designed for reranking tasks in image-text multimodal retrieval scenarios.
-`Lychee-rerank-mm` is jointly developed by the NLP Team of Harbin Institute of Technology, Shenzhen, and the 3B/7B parameter versions are released as open source.
+`Lychee-rerank-mm` is jointly developed by the NLP Team of Harbin Institute of Technology, Shenzhen, and the 7B-parameter version is released as open source.
 
 ![The model's framework](images/model_arch.png)
 
@@ -19,7 +19,7 @@ base_model:
 
 - Model Type: Multimodal Reranking
 - Language Support: en
-- Param Size: 3B/7B
+- Param Size: 7B
 - Model Precision: BF16
 
 For more details, please refer to our paper.
@@ -29,8 +29,7 @@ For more details, please refer to our paper.
 
 | Model Type | Models | Size | Instruction Aware |
 |------------------------|----------------------|------|----------------------|
-| Multimodal Reranking | [lychee-rerank-mm-3B](https://huggingface.co/vec-ai/lychee-rerank-mm-3b) | 3.75B | Yes |
-| Multimodal Reranking | [lychee-rerank-mm-7B](https://huggingface.co/vec-ai/lychee-rerank-mm-7b) | 8.29B | Yes |
+| Multimodal Reranking | [lychee-rerank-mm](https://huggingface.co/vec-ai/lychee-rerank-mm) | 8.29B | Yes |
 
 > **Note**:
 > - `Instruction Aware` notes whether the reranking model supports customizing the input instruction according to different tasks.
@@ -183,19 +182,30 @@ print("scores: ", scores)
 
 ## Evaluation
 
-| Model | Param | T→T (14) | I→I (1) | T→I (4) | T→VD (5) | I→T (5) | T→IT (2) | IT→T (4) | IT→I (2) | IT→IT (3) | ALL (40) |
-|-------------|-------|----------|---------|---------|----------|---------|----------|----------|----------|-----------|----------|
-| GME-2B | 2.21B | 49.59 | 30.75 | 48.46 | 66.39 | 52.62 | 77.02 | 39.88 | 36.70 | 66.89 | 52.54 |
+| Model | Param | ALL (40) | T→T (14) | I→I (1) | T→I (4) | T→VD (5) | I→T (5) | T→IT (2) | IT→T (4) | IT→I (2) | IT→IT (3) |
+|-------------|-------|----------|----------|---------|---------|----------|---------|----------|----------|----------|-----------|
+| GME-2B | 2.21B | 52.54 | 49.59 | 30.75 | 48.46 | 66.39 | 52.62 | 77.02 | 39.88 | 36.70 | 66.89 |
 ||
-| Qwen3-Reranker | 4.02B | 60.49 | -- | -- | -- | -- | -- | -- | -- | -- | -- |
-| Jina-rerank-m0 | 2.21B | 55.36 | 27.50 | 59.46 | 73.13 | 55.43 | 74.95 | 27.82 | 37.65 | 51.54 | 54.36 |
-| MonoQwen2-VL-v0.1 | 2.21B | 48.89 | 12.59 | 58.73 | 71.29 | 19.62 | 76.46 | 14.35 | 31.75 | 35.83 | 44.20 |
+| Qwen3-Reranker | 4.02B | -- | 60.49 | -- | -- | -- | -- | -- | -- | -- | -- |
+| Jina-rerank-m0 | 2.21B | 54.36 | 55.36 | 27.50 | 59.46 | 73.13 | 55.43 | 74.95 | 27.82 | 37.65 | 51.54 |
+| MonoQwen2-VL-v0.1 | 2.21B | 44.20 | 48.89 | 12.59 | 58.73 | 71.29 | 19.62 | 76.46 | 14.35 | 31.75 | 35.83 |
 ||
-| **lychee-rerank-mm-3B** | 3.75B | 59.22 | 29.76 | 58.85 | 72.38 | 63.06 | 81.96 | 48.81 | 43.97 | 79.08 | 61.40 |
-| **lychee-rerank-mm-7B** | 8.29B | 61.08 | 32.83 | 61.18 | 72.94 | 66.61 | 84.55 | 53.29 | 47.39 | 82.19 | 63.85 |
+| **lychee-rerank-mm-3B** | 3.75B | 61.40 | 59.22 | 29.76 | 58.85 | 72.38 | 63.06 | 81.96 | 48.81 | 43.97 | 79.08 |
+| **lychee-rerank-mm-7B** | 8.29B | 63.85 | 61.08 | 32.83 | 61.18 | 72.94 | 66.61 | 84.55 | 53.29 | 47.39 | 82.19 |
 
 For more details, please refer to our paper.
 
 ## Citation
 
-If you find our work helpful, feel free to give us a cite.(coming soon)
+If you find our work helpful, feel free to cite us:
+```
+@misc{dai2025supervisedfinetuningcontrastivelearning,
+  title={Supervised Fine-Tuning or Contrastive Learning? Towards Better Multimodal LLM Reranking},
+  author={Ziqi Dai and Xin Zhang and Mingxin Li and Yanzhao Zhang and Dingkun Long and Pengjun Xie and Meishan Zhang and Wenjie Li and Min Zhang},
+  year={2025},
+  eprint={2510.14824},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2510.14824},
+}
+```
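The parenthesized numbers in the evaluation-table headers are per-category task counts, and the `ALL (40)` column appears to be the task-count-weighted mean of the category scores. A quick sketch verifying this against the GME-2B row (the category keys are just labels for this check):

```python
# Verify that ALL (40) equals the task-count-weighted mean of the
# per-category scores, using the GME-2B row of the evaluation table.
counts = {"T2T": 14, "I2I": 1, "T2I": 4, "T2VD": 5, "I2T": 5,
          "T2IT": 2, "IT2T": 4, "IT2I": 2, "IT2IT": 3}
gme_2b = {"T2T": 49.59, "I2I": 30.75, "T2I": 48.46, "T2VD": 66.39,
          "I2T": 52.62, "T2IT": 77.02, "IT2T": 39.88, "IT2I": 36.70,
          "IT2IT": 66.89}

total_tasks = sum(counts.values())  # 14+1+4+5+5+2+4+2+3 = 40
weighted = sum(counts[k] * gme_2b[k] for k in counts) / total_tasks
print(round(weighted, 2))  # 52.54, matching the ALL (40) column
```

The same weighting reproduces the other rows' `ALL` values, which is why Qwen3-Reranker (scored only on the T→T tasks) has no `ALL` entry.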
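For readers new to rerankers: the usage section referenced in the diff context (`print("scores: ", scores)`) suggests the model emits one relevance score per query-candidate pair, and downstream code sorts candidates by that score. A minimal, model-agnostic sketch of that interface; the `toy_score` function is a hypothetical stand-in for illustration only, not the actual `lychee-rerank-mm` API:

```python
from typing import Callable, List, Tuple

def rerank(query: str,
           candidates: List[str],
           score_fn: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Score each (query, candidate) pair and return candidates
    sorted by descending relevance score."""
    scored = [(cand, score_fn(query, cand)) for cand in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical stand-in scorer: fraction of query tokens found in the
# candidate. A real multimodal reranker replaces this with a forward
# pass over the (instruction, query, document/image) input.
def toy_score(query: str, doc: str) -> float:
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

ranked = rerank("red sports car",
                ["a blue bicycle", "a red sports car on a track", "city skyline"],
                toy_score)
print([cand for cand, _ in ranked])
```

With an instruction-aware model like this one, the task instruction would simply be prepended to each pair before scoring.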