RIME-7B

RIME-7B is a RIME (Rewrite-drIven Multimodal Embedding) model built on Qwen2-VL-7B-Instruct.

Model Description

RIME jointly optimizes generation and embedding through a retrieval-friendly rewrite paradigm, producing both discriminative and generative multimodal embeddings for text, images, videos, and visual documents.

Usage

See the RIME repository for inference and evaluation examples.
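
As a rough illustration of how embeddings of this kind are commonly consumed downstream, the sketch below ranks candidates against a query by cosine similarity. It does not load or call RIME: the vectors are toy placeholders, and `cosine_similarity` and `rank_candidates` are hypothetical helper names that do not come from the RIME repository. Using cosine similarity as the scoring function is also an assumption here; the repository remains the reference for the actual inference pipeline.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query_vec, candidate_vecs):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
]

ranking = rank_candidates(query, candidates)
print(ranking)  # [1, 2, 0]
```

In practice the query and candidate vectors would be replaced by embeddings produced with the inference code in the RIME repository, and the ranking step would usually be vectorized or delegated to a vector index.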

Citation

@article{wu2026beyond,
  title={Beyond Chain-of-Thought: Rewrite as a Universal Interface for Generative Multimodal Embeddings},
  author={Wu, Peixi and Mei, Ke and Ma, Feipeng and Chai, Bosong and Lan, Zhibin and Zhao, Chenxi and Yan, Shannan and Chen, Jie and Hu, Zhangchi and Peng, Yansong and others},
  journal={arXiv preprint arXiv:2604.22280},
  year={2026}
}
