nielsr (HF Staff) committed
Commit 48ef4e2 · verified · Parent(s): 3d0fb85

Improve model card: add pipeline tag, library name, and HF paper link


This PR enhances the model card by:
- Adding `pipeline_tag: any-to-any` to reflect the model's universal multimodal retrieval capabilities.
- Specifying `library_name: transformers`, since the model is compatible with the Hugging Face Transformers library.
- Linking directly to the Hugging Face paper page ([U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs](https://huggingface.co/papers/2507.14902)) for easier access and discoverability on the Hugging Face Hub.
- Including the BibTeX citation from the project's GitHub repository for proper attribution.

These additions improve the model's discoverability on the Hub and provide more comprehensive information to users.
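
For reference, the card's YAML front matter after applying the changes described above reads (taken directly from the diff below; field order follows the committed file):

```yaml
---
base_model:
- Qwen/Qwen2-VL-7B-Instruct
datasets:
- TIGER-Lab/M-BEIR
language:
- en
license: apache-2.0
pipeline_tag: any-to-any
library_name: transformers
---
```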

Files changed (1): README.md (+20, -5)
README.md CHANGED

@@ -1,14 +1,18 @@
 ---
-license: apache-2.0
+base_model:
+- Qwen/Qwen2-VL-7B-Instruct
 datasets:
 - TIGER-Lab/M-BEIR
 language:
 - en
-base_model:
-- Qwen/Qwen2-VL-7B-Instruct
+license: apache-2.0
+pipeline_tag: any-to-any
+library_name: transformers
 ---
 
-## U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding
+## U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs
+
+This repository contains the official model checkpoints and inference code for **U-MARVEL**, presented in the paper [U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs](https://huggingface.co/papers/2507.14902).
 
 Universal multimodal retrieval (UMR) addresses complex retrieval tasks involving diverse modalities for both queries and candidates. Despite the success of state-of-the-art methods based on multimodal large language models (MLLMs) using contrastive learning principles, the mechanisms underlying their retrieval capabilities remain largely unexplored. This gap potentially leads to suboptimal performance and limited generalization ability.
 
@@ -91,4 +95,15 @@ single-model architectures and recall-then-rerank approaches on M-BEIR benchmark
 
 ## Acknowledgements
 
-Many thanks to the code bases from **[lamra](https://github.com/Code-kunkun/LamRA)** .
+Many thanks to the code bases from **[LamRA](https://github.com/Code-kunkun/LamRA)** .
+
+## Citation
+If you use this code for your research or project, please cite:
+```latex
+@article{li2025umarvel,
+  title={U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs},
+  author={Li, Xiaojie and Li, Chu and Chen, Shi-Zhe and Chen, Xi},
+  journal={arXiv preprint arXiv:2507.14902},
+  year={2025}
+}
+```