
RIME-7B

RIME (Rewrite-drIven Multimodal Embedding) is a multimodal embedding model built on Qwen2-VL-7B-Instruct.

Model Description

RIME jointly optimizes generation and embedding through a retrieval-friendly rewrite paradigm, producing both discriminative and generative multimodal embeddings for text, images, videos, and visual documents.

Usage

See the RIME repository for inference and evaluation examples.
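
Once embeddings have been computed, a common way to use them for retrieval is to rank candidates by cosine similarity to the query. The sketch below illustrates only that downstream step, not RIME's own inference API. The helper names (`cosine_similarity`, `rank_candidates`) and the toy 3-dimensional vectors are hypothetical placeholders; real embeddings would come from the model via the repository's inference code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare a zero vector")
    return dot / (norm_a * norm_b)


def rank_candidates(query_embedding, candidate_embeddings):
    """Return (candidate_id, score) pairs sorted by descending similarity."""
    scored = [
        (candidate_id, cosine_similarity(query_embedding, embedding))
        for candidate_id, embedding in candidate_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for model-produced embeddings.
query = [1.0, 0.0, 0.0]
candidates = {
    "image_a": [0.9, 0.1, 0.0],
    "video_b": [0.0, 1.0, 0.0],
    "doc_c": [0.5, 0.5, 0.0],
}

ranking = rank_candidates(query, candidates)
for candidate_id, score in ranking:
    print(f"{candidate_id}: {score:.4f}")
```

Because the same ranking function works for any pair of vectors, text, image, video, and visual-document embeddings can be compared in one shared space, provided they were produced by the same model.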

Citation

@article{wu2026beyond,
  title={Beyond Chain-of-Thought: Rewrite as a Universal Interface for Generative Multimodal Embeddings},
  author={Wu, Peixi and Mei, Ke and Ma, Feipeng and Chai, Bosong and Lan, Zhibin and Zhao, Chenxi and Yan, Shannan and Chen, Jie and Hu, Zhangchi and Peng, Yansong and others},
  journal={arXiv preprint arXiv:2604.22280},
  year={2026}
}
