---
base_model: Qwen/Qwen3-VL-8B-Instruct
language:
- en
license: mit
pipeline_tag: video-text-to-text
library_name: transformers
arxiv: 2602.02444
tags:
- video
- retrieval
- reranking
- qwen3-vl
---

# RankVideo

RankVideo is a video-native reasoning reranker for text-to-video retrieval, fine-tuned from [Qwen3-VL-8B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct). The model explicitly reasons over query-video pairs using video content to assess relevance. It was introduced in the paper [RANKVIDEO: Reasoning Reranking for Text-to-Video Retrieval](https://huggingface.co/papers/2602.02444).

- **Repository:** [https://github.com/tskow99/RANKVIDEO-Reasoning-Reranker](https://github.com/tskow99/RANKVIDEO-Reasoning-Reranker)
- **Paper:** [RANKVIDEO: Reasoning Reranking for Text-to-Video Retrieval](https://arxiv.org/abs/2602.02444)

## Training Data

This model was trained using the [MultiVENT 2.0 dataset](https://huggingface.co/datasets/hltcoe/MultiVENT2.0).

## Usage

You can use the model for scoring query-video pairs via the `rankvideo` library as follows:

```python
from rankvideo import VLMReranker

reranker = VLMReranker(model_path="hltcoe/RankVideo")

# Score query-video pairs for relevance
scores = reranker.score_batch(
    queries=["person playing guitar"],
    video_paths=["/path/to/video.mp4"],
)

print(f"Relevance score: {scores[0]['logit_delta_yes_minus_no']:.3f}")
```

## BibTeX

```bibtex
@misc{skow2026rankvideoreasoningrerankingtexttovideo,
  title={RANKVIDEO: Reasoning Reranking for Text-to-Video Retrieval},
  author={Tyler Skow and Alexander Martin and Benjamin Van Durme and Rama Chellappa and Reno Kriz},
  year={2026},
  eprint={2602.02444},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2602.02444},
}
```
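
## Example: Ranking Candidates by Score

The usage snippet scores query-video pairs and reads the `logit_delta_yes_minus_no` field from each result. A reranker's scores are typically used to reorder a candidate list, so the following is a minimal sketch of that step in plain Python. It does not call the `rankvideo` library. The helper names `stable_sigmoid` and `rerank_by_score` are hypothetical, the score values in the demo are made-up placeholders, and the reading of the field as a "yes" logit minus a "no" logit is an assumption based only on its name. Under that assumption, applying a sigmoid to the delta gives the probability of "yes" when the two options are normalized against each other.

```python
import math


def stable_sigmoid(x: float) -> float:
    """Numerically stable logistic function (avoids overflow for large |x|)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rerank_by_score(video_paths, scores, key="logit_delta_yes_minus_no"):
    """Sort candidates by descending score.

    `scores` is assumed to be a list of dicts aligned with `video_paths`,
    each holding a float under `key` (the field name used in the usage
    snippet). Returns (path, raw_delta, sigmoid_of_delta) tuples.
    """
    if len(video_paths) != len(scores):
        raise ValueError("video_paths and scores must have the same length")
    paired = sorted(
        zip(video_paths, scores),
        key=lambda pair: pair[1][key],
        reverse=True,
    )
    return [(path, s[key], stable_sigmoid(s[key])) for path, s in paired]


if __name__ == "__main__":
    # Placeholder values for illustration only, not real model output.
    candidates = ["/path/to/a.mp4", "/path/to/b.mp4", "/path/to/c.mp4"]
    fake_scores = [
        {"logit_delta_yes_minus_no": -1.0},
        {"logit_delta_yes_minus_no": 2.0},
        {"logit_delta_yes_minus_no": 0.0},
    ]
    for path, delta, prob in rerank_by_score(candidates, fake_scores):
        print(f"{path}\tdelta={delta:.3f}\tp_yes={prob:.3f}")
```

Sorting on the raw delta and on its sigmoid gives the same order, since the sigmoid is monotonic; the probability column is only a convenience for thresholding or display.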