Improve model card: add pipeline tag, library name, and HF paper link
#2
by nielsr (HF Staff) - opened
README.md
CHANGED
@@ -1,14 +1,18 @@
 ---
-
+base_model:
+- Qwen/Qwen2-VL-7B-Instruct
 datasets:
 - TIGER-Lab/M-BEIR
 language:
 - en
-
-
+license: apache-2.0
+pipeline_tag: any-to-any
+library_name: transformers
 ---
 
-## U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding
+## U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs
+
+This repository contains the official model checkpoints and inference code for **U-MARVEL**, presented in the paper [U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs](https://huggingface.co/papers/2507.14902).
 
 Universal multimodal retrieval (UMR) addresses complex retrieval tasks involving diverse modalities for both queries and candidates. Despite the success of state-of-the-art methods based on multimodal large language models (MLLMs) using contrastive learning principles, the mechanisms underlying their retrieval capabilities remain largely unexplored. This gap potentially leads to suboptimal performance and limited generalization ability.
 
@@ -91,4 +95,15 @@ single-model architectures and recall-then-rerank approaches on M-BEIR benchmark
 
 ## Acknowledgements
 
-Many thanks to the code bases from **[
+Many thanks to the code bases from **[LamRA](https://github.com/Code-kunkun/LamRA)**.
+
+## Citation
+If you use this code for your research or project, please cite:
+```latex
+@article{li2025umarvel,
+  title={U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs},
+  author={Li, Xiaojie and Li, Chu and Chen, Shi-Zhe and Chen, Xi},
+  journal={arXiv preprint arXiv:2507.14902},
+  year={2025}
+}
+```