---
license: apache-2.0
language:
- en
- zh
pipeline_tag: image-to-image
tags:
- QwenImageEditPlusPipeline
---
<div align="center">
<h1>🎨 PosterOmni<br/>Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback</h1>
<img src="images/logo_white.png" alt="PosterOmni Logo" width="180"/>
[![arXiv](https://img.shields.io/badge/arXiv-2602.12127-red)](https://arxiv.org/abs/2602.12127)
[![GitHub](https://img.shields.io/badge/GitHub-Repository-blue)](https://github.com/Ephemeral182/PosterOmni)
[![HuggingFace](https://img.shields.io/badge/🤗-Model-yellow)](https://huggingface.co/MeiGen-AI/PosterOmni_v1)
[![Diffusers](https://img.shields.io/badge/Diffusers-Compatible-9cf)](https://github.com/huggingface/diffusers)
[![Website](https://img.shields.io/badge/🌐-Website-brightgreen)](https://ephemeral182.github.io/PosterOmni/)
</div>
---
## ✨ Overview
**PosterOmni** is a unified image-to-poster framework that bridges two regimes of poster creation within a single model, trained with one unified recipe:
- **Poster Local Editing**: *Rescaling, Filling, Extending, Identity-driven*
- **Poster Global Creation**: *Layout-driven, Style-driven*
- **Unified Training**: *Task distillation + unified reward feedback*
<div align="center">
<img src="images/teaser_0209.jpg" alt="PosterOmni Teaser" width="1000"/>
</div>
> This Hugging Face repository currently provides **PosterOmni-v1 transformer weights** (component-only).
> Other components (VAE / text encoder / tokenizer / scheduler / processor) should be loaded from a compatible base pipeline.
## 🔥 News
- 📄 **[2026.02]** Paper available on arXiv.
- 🤗 **[2026.02]** PosterOmni-v1 transformer weights released on Hugging Face.
---
## 🚀 Quick Start
### 1) Installation
```bash
git clone https://github.com/Ephemeral182/PosterOmni.git
cd PosterOmni
conda create -n posteromni python=3.11 -y
conda activate posteromni
pip install -r requirements.txt
```
### 2) Load with `QwenImageEditPlusPipeline` (Transformer from this repo)
This repo provides **PosterOmni-v1 transformer weights** (Diffusers component).
You should load the full **QwenImageEditPlusPipeline** from a compatible base model, then **replace its `transformer`** with our weights.
```python
import torch
from PIL import Image
from diffusers import QwenImageEditPlusPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.bfloat16 if device.startswith("cuda") else torch.float32

# Load the full pipeline (VAE, text encoder, tokenizer, scheduler, processor) from the base model
base_model = "Qwen/Qwen-Image-Edit-Plus"  # <-- change to your base
pipe = QwenImageEditPlusPipeline.from_pretrained(base_model, torch_dtype=torch_dtype)
pipe.tokenizer_max_length = 1024

# ----------------------
# Load PosterOmni transformer from this repo and plug in
# ----------------------
posteromni_transformer_id = "MeiGen-AI/PosterOmni_v1"
pipe.transformer = pipe.transformer.__class__.from_pretrained(
    posteromni_transformer_id,
    torch_dtype=torch_dtype,
)
# Move to the device only after the swap, so the new transformer is moved as well
pipe = pipe.to(device)

img = Image.open("your_input.jpg").convert("RGB")
# make width/height multiples of 16 (recommended)
w, h = img.size
w = round(w / 16) * 16
h = round(h / 16) * 16

prompt = "Rescale image to 1:1"  # <-- must contain "to"
generator = torch.Generator(device=device).manual_seed(42)
out = pipe(
    image=[img],
    prompt=prompt,
    negative_prompt="",
    width=w,
    height=h,
    num_inference_steps=40,
    true_cfg_scale=4.0,
    guidance_scale=1.0,
    generator=generator,
).images[0]
out.save("posteromni_test.png")
print("Saved to posteromni_test.png")
```
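
The rounding step in the snippet above can be factored into a small helper. This is a minimal sketch, not part of the PosterOmni repo: `snap_to_multiple` is a hypothetical name, and the clamp to at least one multiple is an added assumption so that very small inputs do not round down to 0.

```python
def snap_to_multiple(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Round each side to the nearest multiple, never going below one multiple."""

    def snap(value: int) -> int:
        # round() picks the nearest multiple; max() guards against a result of 0
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(snap_to_multiple(1023, 770))  # (1024, 768)
print(snap_to_multiple(5, 5))       # (16, 16)
```

The returned pair can be passed as `width` and `height` to the pipeline call in place of the inline rounding.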
> ⚠️ Notes:
>
> * This repository is **component-only**. Calling `QwenImageEditPlusPipeline.from_pretrained("MeiGen-AI/PosterOmni_v1")` (or any other pipeline class) directly on it will not work, because the repo lacks `model_index.json` and the other pipeline components.
> * Use a compatible base pipeline and replace its `transformer` with PosterOmni weights as shown above.
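
The component-only check in the notes above can be illustrated on a plain list of file names. This is a rough sketch under stated assumptions: `is_component_only` is a hypothetical helper, and the listing is hard-coded from the Model Files section of this README (the sharded `*.safetensors` files are omitted). A live listing could instead be fetched with `huggingface_hub.list_repo_files`, which needs network access.

```python
def is_component_only(repo_files: list[str]) -> bool:
    """True when the listing has no model_index.json, i.e. it is not a full pipeline repo."""
    return "model_index.json" not in repo_files


# Hard-coded listing for illustration; weight shards are left out.
listed_files = ["config.json", "diffusion_pytorch_model.safetensors.index.json"]
print(is_component_only(listed_files))  # True
```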
---
## 🧠 Method (High-level)
PosterOmni is trained with a four-stage workflow:
1. **Task-specific SFT**: train specialized experts for local editing and global creation tasks.
2. **Task Distillation**: distill expert knowledge into a single multi-task model.
3. **Unified Reward Training**: learn a universal reward for text fidelity, visual consistency, and aesthetics.
4. **Omni-Edit Reinforcement Learning**: further align the model with unified reward feedback.
<div align="center">
<img src="images/overview.jpg" alt="PosterOmni Model Architecture" width="1000"/>
</div>
---
## 📚 PosterOmni Dataset
We introduce a unified data suite with **PosterOmni-200K** (training) and **PosterOmni-Bench** (evaluation) for image-to-poster generation.
PosterOmni-200K contains **200K+ paired samples** covering six tasks—**local editing** (Rescaling, Filling, Extending, Identity-driven) and **global creation** (Layout-driven, Style-driven)—and spans six poster themes: **Products, Food, Events/Travel, Nature, Education, Entertainment**.
PosterOmni-Bench provides **540 Chinese** and **480 English** prompts, evenly distributed across the same six themes for consistent evaluation across tasks.
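
As a quick sanity check on the benchmark numbers, the per-theme counts can be worked out as follows. This sketch assumes the split across the six themes is exactly even, as the description suggests; the `prompts_per_theme` helper is hypothetical and not part of the benchmark code.

```python
THEMES = ["Products", "Food", "Events/Travel", "Nature", "Education", "Entertainment"]
PROMPTS = {"zh": 540, "en": 480}  # totals stated above


def prompts_per_theme(total: int, n_themes: int = len(THEMES)) -> int:
    """Prompts per theme, assuming an exactly even split."""
    if total % n_themes:
        raise ValueError(f"{total} prompts do not split evenly over {n_themes} themes")
    return total // n_themes


per_theme = {lang: prompts_per_theme(n) for lang, n in PROMPTS.items()}
print(per_theme)              # {'zh': 90, 'en': 80}
print(sum(PROMPTS.values()))  # 1020
```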
<div align="center">
<img src="images/posteromni_datapipeline.jpg" alt="PosterOmni Dataset Construction Pipeline" width="1000"/>
</div>
---
## 📊 Performance Benchmarks
<div align="center">
<img src="images/results.png" alt="PosterOmni Results" width="1000"/>
</div>
---
## 🧩 Supported Tasks
| Regime | Tasks |
| ---------------------- | ------------------------------------------------- |
| Poster Local Editing | Rescaling · Filling · Extending · Identity-driven |
| Poster Global Creation | Layout-driven · Style-driven |
---
## 🔗 Related Project
We also have a text-to-poster project that may interest you:
> **PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework**
> [![GitHub](https://img.shields.io/badge/GitHub-PosterCraft-black?logo=github)](https://github.com/Ephemeral182/PosterCraft)
> [![arXiv](https://img.shields.io/badge/arXiv-2506.10741-red)](https://arxiv.org/abs/2506.10741)
> [![Project](https://img.shields.io/badge/Project%20Page-Visit-blue)](https://ephemeral182.github.io/PosterCraft/)
> [![HF](https://img.shields.io/badge/🤗-Model-yellow)](https://huggingface.co/PosterCraft/PosterCraft-v1_RL)
---
## 📌 Model Files
This repository provides:
* `config.json`
* `diffusion_pytorch_model-*.safetensors`
* `diffusion_pytorch_model.safetensors.index.json`
(i.e., the `transformer` component's weights in Diffusers format; for `QwenImageEditPlusPipeline` this component is a **`QwenImageTransformer2DModel`**.)
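
For sharded checkpoints, the `*.safetensors.index.json` file maps each tensor name to the shard that stores it under a `weight_map` key. The sketch below lists the shards an index refers to. The `shard_files` helper and the example index content are hypothetical and for illustration only; the tensor and shard names are made up and do not reflect this repo's actual files.

```python
import json


def shard_files(index: dict) -> list[str]:
    """Return the sorted, de-duplicated shard file names referenced by a safetensors index."""
    return sorted(set(index["weight_map"].values()))


# Hypothetical index content, not copied from this repository.
example_index = json.loads("""
{
  "metadata": {"total_size": 0},
  "weight_map": {
    "layer_a.weight": "shard-1.safetensors",
    "layer_b.weight": "shard-1.safetensors",
    "layer_c.weight": "shard-2.safetensors"
  }
}
""")
print(shard_files(example_index))  # ['shard-1.safetensors', 'shard-2.safetensors']
```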
---
## 📬 Contact
**Sixiang Chen**: `schen691@connect.hkust-gz.edu.cn`
**Jianyu Lai**: `jlai218@connect.hkust-gz.edu.cn`
**Jialin Gao**: `gaojialin04@meituan.com`
**Hengyu Shi**: `qq1842084@gmail.com`
**Zhongying Liu**: `liuzhongying@meituan.com`
---
## 📝 Citation
If you find PosterOmni useful for your research, please cite:
```bibtex
@article{chen2026posteromni,
title={PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback},
author={Chen, Sixiang and Lai, Jianyu and Gao, Jialin and Shi, Hengyu and Liu, Zhongying and Ye, Tian and Luo, Junfeng and Wei, Xiaoming and Zhu, Lei},
journal={arXiv preprint arXiv:2602.12127},
year={2026}
}
```