Add video-text-to-text pipeline tag, link to paper, project page, and code

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +20 -3
README.md CHANGED
@@ -1,7 +1,8 @@
 ---
-license: apache-2.0
-library_name: transformers
 base_model: Qwen/Qwen3-VL-4B-Instruct
+library_name: transformers
+license: apache-2.0
+pipeline_tag: video-text-to-text
 tags:
 - video-retrieval
 - temporal-grounding
@@ -10,7 +11,12 @@ tags:
 
 # VideoSearch-R1 DiDeMo Stage 2
 
-This is the Stage 2 VideoSearch-R1 checkpoint trained for DiDeMo.
+This is the Stage 2 VideoSearch-R1 checkpoint trained for DiDeMo, presented in the paper [VideoSearch-R1: Iterative Video Retrieval and Reasoning via Soft Query Refinement](https://huggingface.co/papers/2607.00446).
+
+- **Project Page:** [mlvlab.github.io/VideoSearch-R1](https://mlvlab.github.io/VideoSearch-R1/)
+- **Repository:** [GitHub - mlvlab/VideoSearch-R1](https://github.com/mlvlab/VideoSearch-R1)
+
+## Usage
 
 Use with the VideoSearch-R1 codebase:
 
@@ -18,3 +24,14 @@ Use with the VideoSearch-R1 codebase:
 bash scripts/data_construct/download_preextracted_data.bash didemo
 EVAL_GPUS=0 bash scripts/inference/inference.bash didemo --checkpoint VideoSearchR1/didemo-stage2
 ```
+
+## Citation
+
+```bibtex
+@inproceedings{lee2026videosearchr1,
+  title = {VideoSearch-R1: Iterative Video Retrieval and Reasoning via Soft Query Refinement},
+  author = {Lee, Seohyun and Choi, Seoung and Ko, Dohwan and Kim, Jongha and Kim, Hyunwoo J.},
+  booktitle = {European Conference on Computer Vision (ECCV)},
+  year = {2026}
+}
+```
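
The front-matter change in this PR can be sanity-checked with a small script. The following is a minimal sketch, not part of the PR or of the VideoSearch-R1 codebase: `parse_front_matter`, `REQUIRED_KEYS`, and the abbreviated `README` sample are hypothetical names introduced here for illustration, and the parser assumes a flat front matter (scalar values plus simple `- item` lists). A real YAML parser would be more robust.

```python
# Minimal sketch (assumption: flat "key: value" front matter with simple lists).
# parse_front_matter and REQUIRED_KEYS are illustrative names, not an existing API.

REQUIRED_KEYS = ("base_model", "library_name", "license", "pipeline_tag")


def parse_front_matter(readme_text):
    """Return the front matter between the leading '---' delimiters as a dict."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the front matter
            break
        if stripped.startswith("- "):
            # List item: attach it to the most recent key if that key holds a list.
            if isinstance(meta.get(current_key), list):
                meta[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            value = value.strip()
            # A key with no inline value is treated as the start of a list.
            meta[current_key] = value if value else []
    return meta


# Abbreviated sample based on the metadata lines visible in the diff above.
README = """---
base_model: Qwen/Qwen3-VL-4B-Instruct
library_name: transformers
license: apache-2.0
pipeline_tag: video-text-to-text
tags:
- video-retrieval
- temporal-grounding
---

# VideoSearch-R1 DiDeMo Stage 2
"""

meta = parse_front_matter(README)
missing = [key for key in REQUIRED_KEYS if key not in meta]
print("pipeline_tag:", meta.get("pipeline_tag"))
print("missing keys:", missing)
```

Run against the pre-change front matter, the same check would report `pipeline_tag` as missing, which is the gap this PR closes.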