---
license: apache-2.0
language:
- en
- zh
pipeline_tag: image-to-image
tags:
- QwenImageEditPlusPipeline
---

<div align="center">

<h1>🎨 PosterOmni<br/>Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback</h1>

<img src="images/logo_white.png" alt="PosterOmni Logo" width="180"/>

[arXiv](https://arxiv.org/abs/2602.12127)
[GitHub](https://github.com/Ephemeral182/PosterOmni)
[Hugging Face](https://huggingface.co/MeiGen-AI/PosterOmni_v1)
[Diffusers](https://github.com/huggingface/diffusers)
[Project Page](https://ephemeral182.github.io/PosterOmni/)

</div>

---
## ✨ Overview

**PosterOmni** is a unified image-to-poster framework that bridges two regimes in poster creation, local editing and global creation, with a single training recipe:

- **Poster Local Editing**: *Rescaling, Filling, Extending, Identity-driven*
- **Poster Global Creation**: *Layout-driven, Style-driven*
- **Unified Training**: *Task distillation + unified reward feedback*

<div align="center">
<img src="images/teaser_0209.jpg" alt="PosterOmni Teaser" width="1000"/>
</div>

> This Hugging Face repository currently provides **PosterOmni-v1 transformer weights** (component-only).
> Other components (VAE / text encoder / tokenizer / scheduler / processor) should be loaded from a compatible base pipeline.

## 🔥 News

- 📄 **[2026.02]** Paper available on arXiv.
- 🤗 **[2026.02]** PosterOmni-v1 transformer weights released on Hugging Face.

---
## 🚀 Quick Start

### 1) Installation

```bash
git clone https://github.com/Ephemeral182/PosterOmni.git
cd PosterOmni

conda create -n posteromni python=3.11 -y
conda activate posteromni

pip install -r requirements.txt
```

### 2) Load with `QwenImageEditPlusPipeline` (Transformer from this repo)

This repo provides **PosterOmni-v1 transformer weights** (Diffusers component).
Load the full **QwenImageEditPlusPipeline** from a compatible base model, then **replace its `transformer`** with our weights.

```python
import torch
from PIL import Image
from diffusers import QwenImageEditPlusPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.bfloat16 if device.startswith("cuda") else torch.float32

base_model = "Qwen/Qwen-Image-Edit-Plus"  # <-- change to your base
pipe = QwenImageEditPlusPipeline.from_pretrained(base_model, torch_dtype=torch_dtype)
pipe.tokenizer_max_length = 1024

# ----------------------
# Load PosterOmni transformer from this repo and plug in
# ----------------------
posteromni_transformer_id = "MeiGen-AI/PosterOmni_v1"
pipe.transformer = pipe.transformer.__class__.from_pretrained(
    posteromni_transformer_id,
    torch_dtype=torch_dtype,
)

# Move to the device only after the swap, so the new transformer is moved too
pipe = pipe.to(device)

img = Image.open("your_input.jpg").convert("RGB")

# make width/height multiples of 16 (recommended)
w, h = img.size
w = round(w / 16) * 16
h = round(h / 16) * 16

prompt = "Rescale image to 1:1"  # <-- must contain "to"
generator = torch.Generator(device=device).manual_seed(42)

out = pipe(
    image=[img],
    prompt=prompt,
    negative_prompt="",
    width=w,
    height=h,
    num_inference_steps=40,
    true_cfg_scale=4.0,
    guidance_scale=1.0,
    generator=generator,
).images[0]

out.save("posteromni_test.png")
print("Saved to posteromni_test.png")
```
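
The width/height rounding in the snippet above can be factored into a small helper. This is only a sketch: the name `snap_to_multiple` and the lower bound of one full multiple are illustrative assumptions, not part of the PosterOmni codebase. It also rounds exact ties upward, whereas Python's built-in `round()` rounds ties to the nearest even number.

```python
def snap_to_multiple(value: int, base: int = 16) -> int:
    """Round `value` to the nearest multiple of `base`, never going below `base`."""
    if value <= 0 or base <= 0:
        raise ValueError("value and base must be positive")
    # Integer arithmetic: adding half the base before floor-dividing rounds ties upward.
    snapped = ((value + base // 2) // base) * base
    # Assumption: a zero-sized dimension is never wanted, so clamp to one multiple.
    return max(base, snapped)


# Hypothetical input size, in the (width, height) order returned by PIL's `img.size`.
w, h = 1023, 771
print(snap_to_multiple(w), snap_to_multiple(h))
```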

> ⚠️ Notes:
>
> * This repository is **component-only**. Loading it directly as a full pipeline, e.g. `QwenImageEditPlusPipeline.from_pretrained("MeiGen-AI/PosterOmni_v1")`, will not work because it lacks `model_index.json` and the other pipeline components.
> * Use a compatible base pipeline and replace its `transformer` with the PosterOmni weights as shown above.

---
## 🧠 Method (High-level)

PosterOmni is trained with a four-stage workflow:

1. **Task-specific SFT**: train specialized experts for local editing and global creation tasks.
2. **Task Distillation**: distill expert knowledge into a single multi-task model.
3. **Unified Reward Training**: learn a universal reward for text fidelity, visual consistency, and aesthetics.
4. **Omni-Edit Reinforcement Learning**: further align the model with unified reward feedback.

<div align="center">
<img src="images/overview.jpg" alt="PosterOmni Model Architecture" width="1000"/>
</div>

---
## 📚 PosterOmni Dataset

We introduce a unified data suite for image-to-poster generation: **PosterOmni-200K** (training) and **PosterOmni-Bench** (evaluation).
PosterOmni-200K contains **200K+ paired samples** covering six tasks, **local editing** (Rescaling, Filling, Extending, Identity-driven) and **global creation** (Layout-driven, Style-driven), and spans six poster themes: **Products, Food, Events/Travel, Nature, Education, Entertainment**.
PosterOmni-Bench provides **540 Chinese** and **480 English** prompts, evenly distributed across the same six themes for consistent evaluation across tasks.
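
As a quick sanity check on the benchmark sizes quoted above, the per-theme prompt counts follow from simple division. This sketch assumes that "evenly distributed" means an equal split within each language; the variable names are illustrative.

```python
# Prompt counts quoted in this README for PosterOmni-Bench.
prompts = {"zh": 540, "en": 480}
themes = ["Products", "Food", "Events/Travel", "Nature", "Education", "Entertainment"]

# Assumption: each language's prompts are split equally across the six themes.
per_theme = {}
for lang, count in prompts.items():
    quotient, remainder = divmod(count, len(themes))
    assert remainder == 0, f"{lang} prompts do not divide evenly across themes"
    per_theme[lang] = quotient

total = sum(prompts.values())
print(per_theme, total)
```

Under that assumption the split works out to 90 Chinese and 80 English prompts per theme, 1,020 prompts in total.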

<div align="center">
<img src="images/posteromni_datapipeline.jpg" alt="PosterOmni Dataset Construction Pipeline" width="1000"/>
</div>

---
## 📊 Performance Benchmarks

<div align="center">
<img src="images/results.png" alt="PosterOmni Results" width="1000"/>
</div>

---
## 🧩 Supported Tasks

| Regime                 | Tasks                                             |
| ---------------------- | ------------------------------------------------- |
| Poster Local Editing   | Rescaling · Filling · Extending · Identity-driven |
| Poster Global Creation | Layout-driven · Style-driven                      |

---
## 🔗 Related Project

We also have a text-to-poster project that may interest you:

> **PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework**
> [GitHub](https://github.com/Ephemeral182/PosterCraft)
> [arXiv](https://arxiv.org/abs/2506.10741)
> [Project Page](https://ephemeral182.github.io/PosterCraft/)
> [Hugging Face](https://huggingface.co/PosterCraft/PosterCraft-v1_RL)

---
## 📌 Model Files

This repository provides the PosterOmni-v1 transformer (the `transformer` component of `QwenImageEditPlusPipeline`) as Diffusers-format weights:

* `config.json`
* `diffusion_pytorch_model-*.safetensors`
* `diffusion_pytorch_model.safetensors.index.json`
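
To see why this layout counts as component-only, a file listing can be checked for `model_index.json` and for the three file patterns named above. The sketch below is illustrative: `describe_layout` is a hypothetical helper, and the listing is made up. A real listing could be fetched with `huggingface_hub.list_repo_files`.

```python
from fnmatch import fnmatch

# File patterns listed in this README for the transformer component.
EXPECTED_PATTERNS = [
    "config.json",
    "diffusion_pytorch_model-*.safetensors",
    "diffusion_pytorch_model.safetensors.index.json",
]


def describe_layout(filenames):
    """Report whether a listing looks like a full pipeline and which patterns are missing."""
    # Diffusers pipelines are identified by a top-level model_index.json.
    has_pipeline_index = "model_index.json" in filenames
    missing = [
        pattern
        for pattern in EXPECTED_PATTERNS
        if not any(fnmatch(name, pattern) for name in filenames)
    ]
    return {"is_pipeline": has_pipeline_index, "missing": missing}


# Hypothetical listing; the shard names are invented for illustration.
listing = [
    "README.md",
    "config.json",
    "diffusion_pytorch_model-00001-of-00002.safetensors",
    "diffusion_pytorch_model-00002-of-00002.safetensors",
    "diffusion_pytorch_model.safetensors.index.json",
]
print(describe_layout(listing))
```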

---
## 📬 Contact

**Sixiang Chen**: `schen691@connect.hkust-gz.edu.cn`

**Jianyu Lai**: `jlai218@connect.hkust-gz.edu.cn`

**Jialin Gao**: `gaojialin04@meituan.com`

**Hengyu Shi**: `qq1842084@gmail.com`

**Zhongying Liu**: `liuzhongying@meituan.com`

---
## 📝 Citation

If you find PosterOmni useful for your research, please cite:

```bibtex
@article{chen2026posteromni,
  title={PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback},
  author={Chen, Sixiang and Lai, Jianyu and Gao, Jialin and Shi, Hengyu and Liu, Zhongying and Ye, Tian and Luo, Junfeng and Wei, Xiaoming and Zhu, Lei},
  journal={arXiv preprint arXiv:2602.12127},
  year={2026}
}
```