| --- |
| license: apache-2.0 |
| library_name: transformers |
| pipeline_tag: image-text-to-text |
| --- |
| |
| # Gen-Searcher SFT Model |
|
|
| This repository contains the Supervised Fine-Tuning (SFT) model presented in the paper: [Gen-Searcher: Reinforcing Agentic Search for Image Generation](https://arxiv.org/abs/2603.28767). |
|
|
| This is an intermediate model prepared for subsequent reinforcement learning (RL) training using the GRPO algorithm with dual reward feedback. |
|
|
| [**Project Page**](https://gen-searcher.vercel.app/) | [**Code**](https://github.com/tulerfeng/Gen-Searcher) | [**Paper**](https://arxiv.org/abs/2603.28767) |
|
|
| # Intro |
|
|
| <div align="center"> |
| <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/teaser.jpg?raw=true" alt="Gen-Searcher Teaser" width="80%"> |
| </div> |
|
|
| We introduce **Gen-Searcher**, the first attempt to train a multimodal **deep research agent** for image generation that requires complex real-world knowledge. Before generating an image, Gen-Searcher can **search the web, browse evidence, reason over multiple sources, and search for visual references**, enabling more accurate and up-to-date image synthesis in real-world scenarios. |
|
|
| We build two dedicated training datasets, **Gen-Searcher-SFT-10k** and **Gen-Searcher-RL-6k**, and one new benchmark, **KnowGen**, for search-grounded image generation. |
|
|
| Gen-Searcher achieves significant improvements, delivering **15+ point gains on the KnowGen and WISE benchmarks**. It also demonstrates **strong transferability** to various image generators. |
|
|
| All code, models, data, and the benchmark are fully released. |
|
|
| ## Demo |
|
|
| #### Inference Process Example |
|
|
| <div align="center"> |
| <img src="https://github.com/tulerfeng/Gen-Searcher/blob/main/assets/example.jpg?raw=true" alt="Inference Process Example" width="85%"> |
| </div> |
|
|
| For more examples, please refer to our [Project Page](https://gen-searcher.vercel.app/). |
|
|
| ## Citation |
|
|
| If you find our work helpful for your research, please consider citing it: |
|
|
| ```bibtex |
| @article{feng2026gen, |
| title={Gen-Searcher: Reinforcing Agentic Search for Image Generation}, |
| author={Feng, Kaituo and Zhang, Manyuan and Chen, Shuang and Lin, Yunlong and Fan, Kaixuan and Jiang, Yilei and Li, Hongyu and Zheng, Dian and Wang, Chenyang and Yue, Xiangyu}, |
| journal={arXiv preprint arXiv:2603.28767}, |
| year={2026} |
| } |
| ``` |