VideoSearchR1/activitynet-stage2

@@ -1,7 +1,8 @@
 ---
-license: apache-2.0
-library_name: transformers
 base_model: Qwen/Qwen3-VL-4B-Instruct
 tags:
 - video-retrieval
 - temporal-grounding
@@ -10,13 +11,29 @@ tags:
 # VideoSearch-R1 ActivityNet Stage 2
-This is the Stage 2 VideoSearch-R1 checkpoint trained for ActivityNet.
 Stage 2 starts from the ActivityNet Stage 1 checkpoint and optimizes iterative retrieval and temporal grounding behavior with the VideoSearch-R1 training pipeline.
 Use with the VideoSearch-R1 codebase:
 ```bash
 bash scripts/data_construct/download_preextracted_data.bash activitynet
 EVAL_GPUS=0 bash scripts/inference/inference.bash activitynet --checkpoint VideoSearchR1/activitynet-stage2
 ```

 ---
 base_model: Qwen/Qwen3-VL-4B-Instruct
+library_name: transformers
+license: apache-2.0
+pipeline_tag: video-text-to-text
 tags:
 - video-retrieval
 - temporal-grounding
 # VideoSearch-R1 ActivityNet Stage 2
+This is the Stage 2 VideoSearch-R1 checkpoint trained for ActivityNet, presented in the paper [VideoSearch-R1: Iterative Video Retrieval and Reasoning via Soft Query Refinement](https://huggingface.co/papers/2607.00446).
+- **Project Page:** [https://mlvlab.github.io/VideoSearch-R1/](https://mlvlab.github.io/VideoSearch-R1/)
+- **Repository:** [https://github.com/mlvlab/VideoSearch-R1](https://github.com/mlvlab/VideoSearch-R1)
 Stage 2 starts from the ActivityNet Stage 1 checkpoint and optimizes iterative retrieval and temporal grounding behavior with the VideoSearch-R1 training pipeline.
+## Usage
 Use with the VideoSearch-R1 codebase:
 ```bash
 bash scripts/data_construct/download_preextracted_data.bash activitynet
 EVAL_GPUS=0 bash scripts/inference/inference.bash activitynet --checkpoint VideoSearchR1/activitynet-stage2
 ```
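The two commands above can be wrapped so they always run in order and stop at the first failure. This is a minimal sketch, not part of the VideoSearch-R1 codebase: the function name `run_stage2_eval` and the `DRY_RUN` switch are hypothetical, and it assumes you run it from the repository root with the script paths exactly as shown above.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not shipped with VideoSearch-R1): runs the two
# documented steps in order, or only prints them when DRY_RUN=1.
run_stage2_eval() {
  local dataset="${1:-activitynet}"
  local checkpoint="${2:-VideoSearchR1/activitynet-stage2}"
  local gpus="${EVAL_GPUS:-0}"   # defaults to GPU 0, as in the command above

  if [[ "${DRY_RUN:-0}" == "1" ]]; then
    echo "bash scripts/data_construct/download_preextracted_data.bash ${dataset}"
    echo "EVAL_GPUS=${gpus} bash scripts/inference/inference.bash ${dataset} --checkpoint ${checkpoint}"
    return 0
  fi

  # Step 1: download the pre-extracted data for the dataset.
  bash scripts/data_construct/download_preextracted_data.bash "${dataset}"
  # Step 2: run inference with the chosen checkpoint.
  EVAL_GPUS="${gpus}" bash scripts/inference/inference.bash "${dataset}" --checkpoint "${checkpoint}"
}
```

Calling `DRY_RUN=1 run_stage2_eval` prints the two commands without executing them, which is a quick way to check the arguments before starting a long evaluation.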
+## Citation
+```bibtex
+@inproceedings{lee2026videosearchr1,
+  title     = {VideoSearch-R1: Iterative Video Retrieval and Reasoning via Soft Query Refinement},
+  author    = {Lee, Seohyun and Choi, Seoung and Ko, Dohwan and Kim, Jongha and Kim, Hyunwoo J.},
+  booktitle = {European Conference on Computer Vision (ECCV)},
+  year      = {2026}
+}
+```