Add video-text-to-text pipeline tag, link paper, project page, and code to model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -1,7 +1,8 @@
 ---
-license: apache-2.0
-library_name: transformers
 base_model: Qwen/Qwen3-VL-4B-Instruct
+library_name: transformers
+license: apache-2.0
+pipeline_tag: video-text-to-text
 tags:
 - video-retrieval
 - temporal-grounding
@@ -10,13 +11,29 @@ tags:

 # VideoSearch-R1 ActivityNet Stage 2

-This is the Stage 2 VideoSearch-R1 checkpoint trained for ActivityNet.
+This is the Stage 2 VideoSearch-R1 checkpoint trained for ActivityNet, presented in the paper [VideoSearch-R1: Iterative Video Retrieval and Reasoning via Soft Query Refinement](https://huggingface.co/papers/2607.00446).
+
+- **Project Page:** [https://mlvlab.github.io/VideoSearch-R1/](https://mlvlab.github.io/VideoSearch-R1/)
+- **Repository:** [https://github.com/mlvlab/VideoSearch-R1](https://github.com/mlvlab/VideoSearch-R1)

 Stage 2 starts from the ActivityNet Stage 1 checkpoint and optimizes iterative retrieval and temporal grounding behavior with the VideoSearch-R1 training pipeline.

+## Usage
+
 Use with the VideoSearch-R1 codebase:

 ```bash
 bash scripts/data_construct/download_preextracted_data.bash activitynet
 EVAL_GPUS=0 bash scripts/inference/inference.bash activitynet --checkpoint VideoSearchR1/activitynet-stage2
 ```
+
+## Citation
+
+```bibtex
+@inproceedings{lee2026videosearchr1,
+title = {VideoSearch-R1: Iterative Video Retrieval and Reasoning via Soft Query Refinement},
+author = {Lee, Seohyun and Choi, Seoung and Ko, Dohwan and Kim, Jongha and Kim, Hyunwoo J.},
+booktitle = {European Conference on Computer Vision (ECCV)},
+year = {2026}
+}
+```
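
The metadata change in this PR can be sanity-checked locally before merging. The following is a minimal sketch, assuming the updated model card has been saved as `README.md` and that its front matter uses flat `key: value` lines as shown in the diff. The helper names `front_matter` and `card_field` are hypothetical and are not part of the VideoSearch-R1 codebase or any Hub tooling.

```shell
# Hypothetical helpers for inspecting a model card's YAML front matter.

front_matter() {
  # Print only the lines between the first pair of '---' delimiters.
  awk '/^---[[:space:]]*$/ { n++; next } n == 1 { print } n >= 2 { exit }' "$1"
}

card_field() {
  # Print the value of a top-level "key: value" entry (first match only).
  # Assumes the key contains no regex metacharacters.
  front_matter "$1" | sed -n "s/^$2:[[:space:]]*//p" | head -n 1
}

# Example usage (assumes README.md exists in the current directory):
#   card_field README.md pipeline_tag
#   card_field README.md library_name
```

This only handles scalar top-level keys; list-valued fields such as `tags` would need a real YAML parser.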