title: VideoSearch-R1
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
app_file: app.py
pinned: false
license: apache-2.0
short_description: Video retrieval with SQR.
Welcome to VideoSearch-R1
Iterative Video Retrieval and Reasoning via Soft Query Refinement
VideoSearch-R1 is an agentic framework for video corpus moment retrieval. It unifies inter-video retrieval and intra-video temporal reasoning through a retrieve → verify → refine → ground loop, with Soft Query Refinement (SQR) performed in the continuous query embedding space.
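The loop described above can be illustrated with a toy sketch. This is not the VideoSearch-R1 implementation: the function names (`retrieve`, `refine`, `search`), the `verify` and `ground` callables, the two-dimensional embeddings, and the Rocchio-style update used as a stand-in for SQR are all assumptions made for illustration. It only shows the general shape of refining a query as a vector in embedding space instead of rewriting its text.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    """Return the ids of the k videos most similar to the query embedding."""
    ranked = sorted(corpus, key=lambda vid: cosine(query, corpus[vid]), reverse=True)
    return ranked[:k]

def refine(query, rejected, step=0.5):
    """Hypothetical soft update: push the query embedding away from a
    rejected video (a Rocchio-style stand-in, not the paper's SQR)."""
    return [q - step * r for q, r in zip(query, rejected)]

def search(query, corpus, verify, ground, k=1, max_rounds=3):
    """Retrieve, verify, refine, and ground until a video is accepted."""
    for _ in range(max_rounds):
        candidates = retrieve(query, corpus, k)
        for vid in candidates:
            if verify(vid):
                # Accepted: localize the moment inside the video.
                return vid, ground(vid)
        # No candidate passed verification: refine the query embedding.
        for vid in candidates:
            query = refine(query, corpus[vid])
    return None, None

# Toy data (assumed): three videos with 2-D embeddings.
corpus = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
query = [1.0, 0.2]
verify = lambda vid: vid == "b"      # placeholder for a learned verifier
ground = lambda vid: (0.0, 5.0)      # placeholder start/end times in seconds

result = search(query, corpus, verify, ground)
print(result)
```

In this toy run the first retrieval returns video `a`, which fails verification; the refined query then ranks video `b` first, which is accepted and grounded.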
News
VideoSearch-R1 has been accepted to ECCV 2026.
Code released.
Trained model checkpoints released.
Paper preprint released on arXiv.
Released Resources
| Resource | Status | Link |
|---|---|---|
| Code | Released | mlvlab/VideoSearch-R1 |
| Project page | Released | mlvlab.github.io/VideoSearch-R1 |
| Trained checkpoints | Released | See model repos below |
| Paper preprint | Released | arXiv:2607.00446 |
Model Checkpoints
| Dataset | Stage 1 SFT | Stage 2 GRPO |
|---|---|---|
| DiDeMo | didemo-sft | didemo-grpo |
| Charades-STA | charades-sft | charades-grpo |
| ActivityNet Captions | Coming soon | Coming soon |
Acknowledgements
VideoSearch-R1 builds on the open-source video-language and reinforcement learning ecosystem and is evaluated on the VERIFIED benchmark with ActivityNet Captions, DiDeMo, and Charades-STA. We thank the benchmark and dataset creators for making these resources available to the community.