---
title: VideoSearch-R1
emoji: 🔎
colorFrom: blue
colorTo: purple
sdk: gradio
app_file: app.py
pinned: false
license: apache-2.0
short_description: Video retrieval with SQR.
---
<div align="center">
# Welcome to VideoSearch-R1
### Iterative Video Retrieval and Reasoning via Soft Query Refinement
<p>
<a href="https://github.com/mlvlab/VideoSearch-R1">
<img src="https://img.shields.io/badge/GitHub-Code-181717?style=for-the-badge&logo=github" alt="GitHub">
</a>
<a href="https://mlvlab.github.io/VideoSearch-R1/">
<img src="https://img.shields.io/badge/Project-Page-2b4f9e?style=for-the-badge" alt="Project Page">
</a>
<a href="https://arxiv.org/abs/2607.00446">
<img src="https://img.shields.io/badge/arXiv-2607.00446-b31b1b?style=for-the-badge" alt="arXiv">
</a>
<img src="https://img.shields.io/badge/ECCV-2026-4c6fff?style=for-the-badge" alt="ECCV 2026">
</p>
**VideoSearch-R1** is an agentic framework for video corpus moment retrieval. It unifies inter-video retrieval and intra-video temporal reasoning through a retrieve → verify → refine → ground loop, with **Soft Query Refinement (SQR)** in the continuous query embedding space.
</div>
---
## News
- <img src="https://img.shields.io/badge/2026.06.17-4c6fff?style=flat-square" alt="2026.06.17"> 🎉 VideoSearch-R1 has been accepted to **ECCV 2026**.
- <img src="https://img.shields.io/badge/2026.06.20-18a058?style=flat-square" alt="2026.06.20"> Code released.
- <img src="https://img.shields.io/badge/2026.06.20-f59e0b?style=flat-square" alt="2026.06.20"> Trained model checkpoints released.
- <img src="https://img.shields.io/badge/2026.07.01-b31b1b?style=flat-square" alt="2026.07.01"> Paper preprint released on [arXiv](https://arxiv.org/abs/2607.00446).
## Released Resources
| Resource | Status | Link |
|---|---:|---|
| Code | Released | [mlvlab/VideoSearch-R1](https://github.com/mlvlab/VideoSearch-R1) |
| Project page | Released | [mlvlab.github.io/VideoSearch-R1](https://mlvlab.github.io/VideoSearch-R1/) |
| Trained checkpoints | Released | See model repos below |
| Paper preprint | Released | [arXiv:2607.00446](https://arxiv.org/abs/2607.00446) |
## Model Checkpoints
| Dataset | Stage 1 SFT | Stage 2 GRPO |
|---|---|---|
| DiDeMo | [didemo-sft](https://huggingface.co/VideoSearchR1/didemo-sft) | [didemo-grpo](https://huggingface.co/VideoSearchR1/didemo-grpo) |
| Charades-STA | [charades-sft](https://huggingface.co/VideoSearchR1/charades-sft) | [charades-grpo](https://huggingface.co/VideoSearchR1/charades-grpo) |
| ActivityNet Captions | Coming soon | Coming soon |
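
To fetch one of the released checkpoints programmatically, something like the following sketch may help. The repository IDs are copied from the table above; the helper names `checkpoint_repo` and `download_checkpoint` and the short dataset keys (`didemo`, `charades`) are hypothetical conveniences, not part of the VideoSearch-R1 codebase. It only downloads the repository files with `huggingface_hub.snapshot_download`; how to load them into a model is described in the GitHub repository, not here.

```python
from typing import Optional

# Repo IDs taken from the checkpoint table above.
# ActivityNet Captions checkpoints are listed as "Coming soon", so they are absent.
CHECKPOINTS = {
    ("didemo", "sft"): "VideoSearchR1/didemo-sft",
    ("didemo", "grpo"): "VideoSearchR1/didemo-grpo",
    ("charades", "sft"): "VideoSearchR1/charades-sft",
    ("charades", "grpo"): "VideoSearchR1/charades-grpo",
}


def checkpoint_repo(dataset: str, stage: str) -> str:
    """Map a (dataset, stage) pair to its Hugging Face repo ID (hypothetical helper)."""
    key = (dataset.lower(), stage.lower())
    if key not in CHECKPOINTS:
        raise KeyError(f"No released checkpoint for dataset={dataset!r}, stage={stage!r}")
    return CHECKPOINTS[key]


def download_checkpoint(dataset: str, stage: str, local_dir: Optional[str] = None) -> str:
    """Download all files of the chosen checkpoint repo and return the local path."""
    # Imported lazily so the mapping above is usable without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=checkpoint_repo(dataset, stage), local_dir=local_dir)
```

For example, `download_checkpoint("didemo", "grpo")` would download the Stage 2 GRPO checkpoint for DiDeMo into the local Hugging Face cache and return its path.
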
## Links
- [GitHub repository](https://github.com/mlvlab/VideoSearch-R1)
- [Project page](https://mlvlab.github.io/VideoSearch-R1/)
- [Paper](https://arxiv.org/abs/2607.00446)
## Acknowledgements
VideoSearch-R1 builds on the open-source video-language and reinforcement learning ecosystem, and is evaluated on the VERIFIED benchmark with ActivityNet Captions, DiDeMo, and Charades-STA. We thank the benchmark and dataset creators for making these resources available to the community.