---
base_model: Qwen/Qwen3-VL-8B-Instruct
language:
- en
license: mit
pipeline_tag: video-text-to-text
library_name: transformers
arxiv: 2602.02444
tags:
- video
- retrieval
- reranking
- qwen3-vl
---
# RankVideo
RankVideo is a video-native reasoning reranker for text-to-video retrieval, fine-tuned from [Qwen3-VL-8B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct).
The model explicitly reasons over query-video pairs using video content to assess relevance. It was introduced in the paper [RANKVIDEO: Reasoning Reranking for Text-to-Video Retrieval](https://huggingface.co/papers/2602.02444).
- **Repository:** [https://github.com/tskow99/RANKVIDEO-Reasoning-Reranker](https://github.com/tskow99/RANKVIDEO-Reasoning-Reranker)
- **Paper:** [RANKVIDEO: Reasoning Reranking for Text-to-Video Retrieval](https://arxiv.org/abs/2602.02444)
## Training Data
This model was trained using the [MultiVENT 2.0 dataset](https://huggingface.co/datasets/hltcoe/MultiVENT2.0).
## Usage
To score query-video pairs for relevance, use the `rankvideo` library:
```python
from rankvideo import VLMReranker
reranker = VLMReranker(model_path="hltcoe/RankVideo")
# Score query-video pairs for relevance
scores = reranker.score_batch(
    queries=["person playing guitar"],
    video_paths=["/path/to/video.mp4"],
)
print(f"Relevance score: {scores[0]['logit_delta_yes_minus_no']:.3f}")
```
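For reranking, the scores are typically used to sort a candidate list. The sketch below is not part of the `rankvideo` library: `rerank` and `yes_probability` are hypothetical helpers, and the only thing taken from the example above is that each score is a dict with a `logit_delta_yes_minus_no` entry. It also assumes that a larger delta means a more relevant video. The sigmoid converts the delta into the probability of "yes" renormalized over just the "yes" and "no" options, which may differ from the probability under the full vocabulary.

```python
import math

# Key taken from the usage example above; everything else here is illustrative.
SCORE_KEY = "logit_delta_yes_minus_no"


def yes_probability(logit_delta: float) -> float:
    """Sigmoid of the yes-minus-no logit difference.

    A softmax over two logits equals the sigmoid of their difference,
    so this is P(yes) restricted to the two options {yes, no}.
    """
    # Branch on the sign to avoid overflow in math.exp for large magnitudes.
    if logit_delta >= 0:
        return 1.0 / (1.0 + math.exp(-logit_delta))
    e = math.exp(logit_delta)
    return e / (1.0 + e)


def rerank(video_paths, scores, key=SCORE_KEY):
    """Return (video_path, logit_delta) pairs, highest delta first."""
    if len(video_paths) != len(scores):
        raise ValueError("video_paths and scores must have the same length")
    pairs = [(path, score[key]) for path, score in zip(video_paths, scores)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up paths and scores standing in for the output of a scoring call.
example_paths = ["a.mp4", "b.mp4", "c.mp4"]
example_scores = [{SCORE_KEY: -1.2}, {SCORE_KEY: 2.5}, {SCORE_KEY: 0.0}]

ranked = rerank(example_paths, example_scores)
for path, delta in ranked:
    print(f"{path}: delta={delta:.3f}, p_yes={yes_probability(delta):.3f}")
```

Sorting on the raw delta and on the sigmoid gives the same order, since the sigmoid is monotonic; the probability is only useful when a bounded score or a threshold is needed.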
## BibTeX
```bibtex
@misc{skow2026rankvideoreasoningrerankingtexttovideo,
      title={RANKVIDEO: Reasoning Reranking for Text-to-Video Retrieval},
      author={Tyler Skow and Alexander Martin and Benjamin Van Durme and Rama Chellappa and Reno Kriz},
      year={2026},
      eprint={2602.02444},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2602.02444},
}
```