| --- |
| base_model: |
| - Qwen/Qwen3-VL-8B-Instruct |
| datasets: |
| - GenSearcher/Train-Data |
| library_name: transformers |
| pipeline_tag: image-text-to-text |
| license: apache-2.0 |
| --- |
| |
| # Gen-Searcher-8B Model |
|
|
| This repository contains the Gen-Searcher-8B model presented in [Gen-Searcher: Reinforcing Agentic Search for Image Generation](https://arxiv.org/abs/2603.28767). |
|
|
| [**Project Page**](https://gen-searcher.vercel.app/) | [**GitHub Repository**](https://github.com/tulerfeng/Gen-Searcher) | [**Paper**](https://arxiv.org/abs/2603.28767) |
|
|
| # ๐ Intro |
|
|
| <div align="center"> |
| <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/teaser.jpg?raw=true" alt="Gen-Searcher Teaser" width="80%"> |
| </div> |
|
|
| We introduce **Gen-Searcher**, as the first attempt to train a multimodal **deep research agent** for image generation that requires complex real-world knowledge. Gen-Searcher can **search the web, browse evidence, reason over multiple sources, and search visual references** before generation, enabling more accurate and up-to-date image synthesis in real-world scenarios. |
|
|
| We build two dedicated training datasets **Gen-Searcher-SFT-10k**, **Gen-Searcher-RL-6k** and one new benchmark **KnowGen** for search-grounded image generation. |
|
|
| Gen-Searcher achieves significant improvements, delivering **15+ point gains on the KnowGen and WISE benchmarks**. It also demonstrates **strong transferability** to various image generators. |
|
|
| All code, models, data, and benchmark are fully released. |
|
|
| ## ๐ฅ Demo |
|
|
| #### Inference Process Example |
|
|
| <div align="center"> |
| <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/example.jpg?raw=true" alt="Inference Process Example" width="85%"> |
| </div> |
|
|
| For more examples, please refer to our website [[๐Project Page]](https://gen-searcher.vercel.app/) |
|
|
| ## ๐ Training and Inference |
|
|
| For detailed instructions on setup, SFT/RL training, and inference, please refer to the [official GitHub repository](https://github.com/tulerfeng/Gen-Searcher). |
|
|
| ## ๐ Citation |
|
|
| If you find our work helpful for your research, please consider citing our work: |
|
|
| ```bibtex |
| @article{feng2025gensearcher, |
| title={Gen-Searcher: Reinforcing Agentic Search for Image Generation}, |
| author={Feng, Kaituo and Zhang, Manyuan and Chen, Shuang and Lin, Yunlong and Fan, Kaixuan and Jiang, Yilei and Li, Hongyu and Zheng, Dian and Wang, Chenyang and Yue, Xiangyu}, |
| journal={arXiv preprint arXiv:2603.28767}, |
| year={2025} |
| } |
| ``` |