	---
	title: VideoSearch-R1
	emoji: 🔎
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	app_file: app.py
	pinned: false
	license: apache-2.0
	short_description: Video retrieval with SQR.
	---

	<div align="center">

	# Welcome to VideoSearch-R1

	### Iterative Video Retrieval and Reasoning via Soft Query Refinement

	<p>
	<a href="https://github.com/mlvlab/VideoSearch-R1">
	<img src="https://img.shields.io/badge/GitHub-Code-181717?style=for-the-badge&logo=github" alt="GitHub">
	</a>
	<a href="https://mlvlab.github.io/VideoSearch-R1/">
	<img src="https://img.shields.io/badge/Project-Page-2b4f9e?style=for-the-badge" alt="Project Page">
	</a>
	<a href="https://arxiv.org/abs/2607.00446">
	<img src="https://img.shields.io/badge/arXiv-2607.00446-b31b1b?style=for-the-badge" alt="arXiv">
	</a>
	<img src="https://img.shields.io/badge/ECCV-2026-4c6fff?style=for-the-badge" alt="ECCV 2026">
	</p>

	VideoSearch-R1 is an agentic framework for video corpus moment retrieval. It unifies inter-video retrieval and intra-video temporal reasoning in a single retrieve → verify → refine → ground loop, in which Soft Query Refinement (SQR) updates the query in the continuous query embedding space.

	</div>
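
The retrieve → verify → refine → ground loop described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the VideoSearch-R1 implementation: the function names (`retrieve`, `refine`, `search`), the `verify` and `ground` callbacks, and in particular the refinement rule (shifting the query embedding away from a rejected video) are all hypothetical stand-ins chosen to show the control flow.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Vector = List[float]
Moment = Tuple[float, float]  # (start_seconds, end_seconds)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query: Vector, corpus: Dict[str, Vector]) -> str:
    """Return the id of the corpus video most similar to the query embedding."""
    return max(corpus, key=lambda vid: cosine(query, corpus[vid]))


def refine(query: Vector, rejected: Vector, step: float = 0.5) -> Vector:
    """Toy soft refinement: move the query embedding away from a rejected video.

    Assumption: this update rule is a placeholder, not the SQR update
    used by VideoSearch-R1. It only shows that refinement acts on the
    embedding rather than on the query text.
    """
    return [q - step * r for q, r in zip(query, rejected)]


def search(
    query: Vector,
    corpus: Dict[str, Vector],
    verify: Callable[[str], bool],    # hypothetical: does this video match?
    ground: Callable[[str], Moment],  # hypothetical: localize the moment
    max_rounds: int = 3,
) -> Optional[Tuple[str, Moment]]:
    """Run retrieve -> verify -> refine until a video is accepted, then ground."""
    for _ in range(max_rounds):
        vid = retrieve(query, corpus)
        if verify(vid):
            return vid, ground(vid)
        query = refine(query, corpus[vid])
    return None  # no candidate was accepted within the round budget


# Toy corpus of 2-D video embeddings; the verifier accepts only video "a".
corpus = {"a": [1.0, 0.0], "b": [0.6, 0.8], "c": [0.0, 1.0]}
result = search(
    [0.9, 0.5],
    corpus,
    verify=lambda vid: vid == "a",
    ground=lambda vid: (2.0, 7.5),
)
print(result)  # ('a', (2.0, 7.5))
```

In this toy run the first retrieval returns video `b`, the verifier rejects it, and the refined embedding retrieves video `a` on the second round. In the real system the verifier and the grounding step would be model calls; here they are lambdas so the example stays self-contained.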

	---

	## News

	- <img src="https://img.shields.io/badge/2026.06.17-4c6fff?style=flat-square" alt="2026.06.17"> 🎉 VideoSearch-R1 has been accepted to ECCV 2026.

	- <img src="https://img.shields.io/badge/2026.06.20-18a058?style=flat-square" alt="2026.06.20"> Code released.

	- <img src="https://img.shields.io/badge/2026.06.20-f59e0b?style=flat-square" alt="2026.06.20"> Trained model checkpoints released.

	- <img src="https://img.shields.io/badge/2026.07.01-b31b1b?style=flat-square" alt="2026.07.01"> Paper preprint released on [arXiv](https://arxiv.org/abs/2607.00446).

	## Released Resources

	| Resource | Status | Link |
	|---|---:|---|
	| Code | Released | [mlvlab/VideoSearch-R1](https://github.com/mlvlab/VideoSearch-R1) |
	| Project page | Released | [mlvlab.github.io/VideoSearch-R1](https://mlvlab.github.io/VideoSearch-R1/) |
	| Trained checkpoints | Released | See model repos below |
	| Paper preprint | Released | [arXiv:2607.00446](https://arxiv.org/abs/2607.00446) |

	## Model Checkpoints

	| Dataset | Stage 1 SFT | Stage 2 GRPO |
	|---|---|---|
	| DiDeMo | [didemo-sft](https://huggingface.co/VideoSearchR1/didemo-sft) | [didemo-grpo](https://huggingface.co/VideoSearchR1/didemo-grpo) |
	| Charades-STA | [charades-sft](https://huggingface.co/VideoSearchR1/charades-sft) | [charades-grpo](https://huggingface.co/VideoSearchR1/charades-grpo) |
	| ActivityNet Captions | Coming soon | Coming soon |
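
One way to fetch a released checkpoint is through `huggingface_hub`. The sketch below is a minimal example, assuming `huggingface_hub` is installed (`pip install huggingface_hub`). The repository ids are copied from the table above; the short dataset keys (`"didemo"`, `"charades"`) and the helper names `checkpoint_repo` and `download_checkpoint` are hypothetical conveniences, not part of the released code.

```python
from typing import Dict

# Repository ids taken from the checkpoint table; the short keys are
# illustrative choices. ActivityNet Captions is omitted because its
# checkpoints are listed as "Coming soon".
CHECKPOINTS: Dict[str, Dict[str, str]] = {
    "didemo": {
        "sft": "VideoSearchR1/didemo-sft",
        "grpo": "VideoSearchR1/didemo-grpo",
    },
    "charades": {
        "sft": "VideoSearchR1/charades-sft",
        "grpo": "VideoSearchR1/charades-grpo",
    },
}


def checkpoint_repo(dataset: str, stage: str) -> str:
    """Map a (dataset, stage) pair to its Hugging Face repository id."""
    try:
        return CHECKPOINTS[dataset.lower()][stage.lower()]
    except KeyError:
        raise ValueError(f"No released checkpoint for {dataset!r} / {stage!r}") from None


def download_checkpoint(dataset: str, stage: str) -> str:
    """Download the checkpoint repository and return the local directory path."""
    # Imported lazily so the lookup helper works without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=checkpoint_repo(dataset, stage))


print(checkpoint_repo("DiDeMo", "GRPO"))  # VideoSearchR1/didemo-grpo
```

Calling `download_checkpoint("didemo", "grpo")` would download the files into the local Hugging Face cache and return that directory. How the weights are then loaded is not covered here; the GitHub repository is the place to check for inference instructions.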

	## Links

	- [GitHub repository](https://github.com/mlvlab/VideoSearch-R1)
	- [Project page](https://mlvlab.github.io/VideoSearch-R1/)
	- [Paper](https://arxiv.org/abs/2607.00446)

	## Acknowledgements

	VideoSearch-R1 builds on the open-source video-language and reinforcement learning ecosystem, and is evaluated on VERIFIED with ActivityNet Captions, DiDeMo, and Charades-STA. We thank the benchmark and dataset creators for making these resources available to the community.