Commit 382ec3b (verified) by nielsr (HF Staff) · 1 parent: 19e19e0

Add pipeline tag, library metadata and research links


Hi! I'm Niels from the Hugging Face community team. I've opened this PR to improve the model card for Gen-Searcher-8B.

Specifically, I have:
- Added the `library_name: transformers` and `pipeline_tag: image-text-to-text` metadata.
- Specified the `license: apache-2.0`.
- Added links to the paper, the official project page, and the GitHub repository for better discoverability and documentation.
- Included a BibTeX citation section.

These changes will help users discover and use the model more effectively on the Hub.
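
As an illustration of the metadata listed above, here is a minimal sketch of how one might check locally that a README's YAML front matter carries the three fields this PR adds or sets. The helper `front_matter_keys` and the `REQUIRED` tuple are hypothetical names introduced for this example; the parsing is deliberately simplified (top-level keys only) and is not a full YAML parser or anything the Hub itself runs.

```python
def front_matter_keys(readme_text: str) -> list[str]:
    """Return the top-level keys of a README's YAML front matter.

    Simplified on purpose: it only looks for unindented `key:` lines
    between the opening and closing `---` delimiters.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no front matter at the top of the file
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            return keys  # closing delimiter reached
        # Top-level keys start at column 0 and are not list items.
        is_top_level = bool(line) and not line[0].isspace() and not line.startswith("-")
        if is_top_level and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return []  # closing delimiter missing: treat as no front matter


# Front matter as it appears in the README after this PR.
README = """---
base_model:
- Qwen/Qwen3-VL-8B-Instruct
datasets:
- GenSearcher/Train-Data
library_name: transformers
pipeline_tag: image-text-to-text
license: apache-2.0
---

# Gen-Searcher-8B Model
"""

# Hypothetical list of fields to check for; adjust as needed.
REQUIRED = ("library_name", "pipeline_tag", "license")
missing = [key for key in REQUIRED if key not in front_matter_keys(README)]
print("missing metadata:", missing)
```

For anything beyond a quick check, a real YAML parser would be the safer choice, since this sketch ignores nested mappings, quoting, and comments.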

Files changed (1)
  1. README.md +24 -17
README.md CHANGED
@@ -1,29 +1,25 @@
  ---
- datasets:
- - GenSearcher/Train-Data
  base_model:
  - Qwen/Qwen3-VL-8B-Instruct
+ datasets:
+ - GenSearcher/Train-Data
+ library_name: transformers
+ pipeline_tag: image-text-to-text
+ license: apache-2.0
  ---

  # Gen-Searcher-8B Model

+ This repository contains the Gen-Searcher-8B model presented in [Gen-Searcher: Reinforcing Agentic Search for Image Generation](https://arxiv.org/abs/2603.28767).

- This repository contains the Gen-Searcher-8B model presented in: [Gen-Searcher](https://arxiv.org/abs/2603.28767)
-
- For inference, please refer to:
-
- Code: https://github.com/tulerfeng/Gen-Searcher
-
-
-
+ [**Project Page**](https://gen-searcher.vercel.app/) | [**GitHub Repository**](https://github.com/tulerfeng/Gen-Searcher) | [**Paper**](https://arxiv.org/abs/2603.28767)

  # 👀 Intro

  <div align="center">
- <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/teaser.jpg?raw=true" alt="Descriptive alt text" width="80%">
+ <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/teaser.jpg?raw=true" alt="Gen-Searcher Teaser" width="80%">
  </div>

-
  We introduce **Gen-Searcher**, as the first attempt to train a multimodal **deep research agent** for image generation that requires complex real-world knowledge. Gen-Searcher can **search the web, browse evidence, reason over multiple sources, and search visual references** before generation, enabling more accurate and up-to-date image synthesis in real-world scenarios.

  We build two dedicated training datasets **Gen-Searcher-SFT-10k**, **Gen-Searcher-RL-6k** and one new benchmark **KnowGen** for search-grounded image generation.
@@ -32,18 +28,29 @@ Gen-Searcher achieves significant improvements, delivering **15+ point gains on

  All code, models, data, and benchmark are fully released.

-
-
-
  ## 🎥 Demo

  #### Inference Process Example

  <div align="center">
- <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/example.jpg?raw=true" alt="Descriptive alt text" width="85%">
+ <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/example.jpg?raw=true" alt="Inference Process Example" width="85%">
  </div>

-
  For more examples, please refer to our website [[🌐Project Page]](https://gen-searcher.vercel.app/)

+ ## 🚀 Training and Inference
+
+ For detailed instructions on setup, SFT/RL training, and inference, please refer to the [official GitHub repository](https://github.com/tulerfeng/Gen-Searcher).
+
+ ## 📐 Citation
+
+ If you find our work helpful for your research, please consider citing our work:

+ ```bibtex
+ @article{feng2025gensearcher,
+ title={Gen-Searcher: Reinforcing Agentic Search for Image Generation},
+ author={Feng, Kaituo and Zhang, Manyuan and Chen, Shuang and Lin, Yunlong and Fan, Kaixuan and Jiang, Yilei and Li, Hongyu and Zheng, Dian and Wang, Chenyang and Yue, Xiangyu},
+ journal={arXiv preprint arXiv:2603.28767},
+ year={2025}
+ }
+ ```