	---
	license: apache-2.0
	language:
	- en
	- zh
	pipeline_tag: image-to-image
	tags:
	- QwenImageEditPlusPipeline
	---

	<div align="center">

	<h1>🎨 PosterOmni<br/>Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback</h1>

	<img src="images/logo_white.png" alt="PosterOmni Logo" width="180"/>

	[![arXiv](https://img.shields.io/badge/arXiv-2602.12127-red)](https://arxiv.org/abs/2602.12127)
	[![GitHub](https://img.shields.io/badge/GitHub-Repository-blue)](https://github.com/Ephemeral182/PosterOmni)
	[![HuggingFace](https://img.shields.io/badge/🤗-Model-yellow)](https://huggingface.co/MeiGen-AI/PosterOmni_v1)
	[![Diffusers](https://img.shields.io/badge/Diffusers-Compatible-9cf)](https://github.com/huggingface/diffusers)
	[![Website](https://img.shields.io/badge/🌐-Website-brightgreen)](https://ephemeral182.github.io/PosterOmni/)

	</div>

	---

	## ✨ Overview

	PosterOmni is a unified image-to-poster framework that handles two regimes of poster creation within a single model:

	- Poster Local Editing: Rescaling, Filling, Extending, Identity-driven
	- Poster Global Creation: Layout-driven, Style-driven

	Both regimes are trained jointly through task distillation and unified reward feedback.

	<div align="center">
	<img src="images/teaser_0209.jpg" alt="PosterOmni Teaser" width="1000"/>
	</div>

	> This Hugging Face repository currently provides PosterOmni-v1 transformer weights (component-only).
	> Other components (VAE / text encoder / tokenizer / scheduler / processor) should be loaded from a compatible base pipeline.



	## 🔥 News

	- 📄 [2026.02] Paper available on arXiv.
	- 🤗 [2026.02] PosterOmni-v1 transformer weights released on Hugging Face.

	---

	## 🚀 Quick Start

	### 1) Installation


	```bash
	git clone https://github.com/Ephemeral182/PosterOmni.git
	cd PosterOmni

	conda create -n posteromni python=3.11 -y
	conda activate posteromni

	pip install -r requirements.txt
	```

	### 2) Load with `QwenImageEditPlusPipeline` (Transformer from this repo)

	This repo provides PosterOmni-v1 transformer weights (Diffusers component).
	You should load the full QwenImageEditPlusPipeline from a compatible base model, then replace its `transformer` with our weights.

	```python
	import torch
	from PIL import Image
	from diffusers import QwenImageEditPlusPipeline

	device = "cuda" if torch.cuda.is_available() else "cpu"
	torch_dtype = torch.bfloat16 if device.startswith("cuda") else torch.float32

	base_model = "Qwen/Qwen-Image-Edit-Plus" # <-- change to your base
	pipe = QwenImageEditPlusPipeline.from_pretrained(base_model, torch_dtype=torch_dtype)
	pipe.tokenizer_max_length = 1024

	# ----------------------
	# Load PosterOmni transformer from this repo and plug in
	# ----------------------
	posteromni_transformer_id = "MeiGen-AI/PosterOmni_v1"
	pipe.transformer = pipe.transformer.__class__.from_pretrained(
	    posteromni_transformer_id,
	    torch_dtype=torch_dtype,
	)

	# Move to the device only after the swap, so the PosterOmni transformer is moved too
	pipe = pipe.to(device)

	img = Image.open("your_input.jpg").convert("RGB")

	# make width/height multiples of 16 (recommended)
	w, h = img.size
	w = round(w / 16) * 16
	h = round(h / 16) * 16

	prompt = "Rescale image to 1:1" # <-- must contain "to"
	generator = torch.Generator(device=device).manual_seed(42)

	out = pipe(
	    image=[img],
	    prompt=prompt,
	    negative_prompt="",
	    width=w,
	    height=h,
	    num_inference_steps=40,
	    true_cfg_scale=4.0,
	    guidance_scale=1.0,
	    generator=generator,
	).images[0]

	out.save("posteromni_test.png")
	print("Saved to posteromni_test.png")
	```
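
The size rounding in the example can be pulled into a small helper. The sketch below is illustrative and not part of the PosterOmni codebase: `snap_to_multiple`, `prepare_size`, and `rescale_prompt` are hypothetical names, and the prompt template only generalizes the single `"Rescale image to 1:1"` example shown here, so its use with other ratios is an assumption.

```python
def snap_to_multiple(value: float, multiple: int = 16) -> int:
    """Round to the nearest multiple, but never go below one multiple."""
    return max(multiple, round(value / multiple) * multiple)


def prepare_size(width: int, height: int, multiple: int = 16) -> tuple[int, int]:
    """Snap both sides of an image size to the given multiple."""
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


def rescale_prompt(ratio: str) -> str:
    """Build a rescaling prompt (assumed template, based on the 1:1 example)."""
    return f"Rescale image to {ratio}"


print(prepare_size(1023, 770))
print(rescale_prompt("1:1"))
```

For a 1023x770 input, `prepare_size` returns `(1024, 768)`. The `max(...)` guard keeps very small inputs from rounding down to a zero-sized dimension.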

	> ⚠️ Notes:
	>
	> * This repository is component-only. Loading it directly as a pipeline, e.g. `QwenImageEditPlusPipeline.from_pretrained("MeiGen-AI/PosterOmni_v1")`, will fail because the repo has no `model_index.json` and none of the other pipeline components.
	> * Use a compatible base pipeline and replace its `transformer` with PosterOmni weights as shown above.
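
One way to tell a component-only repo from a full pipeline repo before loading is to look at its file listing. The sketch below is a minimal illustration under stated assumptions: `is_pipeline_repo` and `is_component_repo` are hypothetical helpers, and the shard names in the sample listing are made up to match the `diffusion_pytorch_model-*.safetensors` pattern.

```python
def is_pipeline_repo(filenames: list[str]) -> bool:
    """A full Diffusers pipeline repo carries model_index.json at its root."""
    return "model_index.json" in filenames


def is_component_repo(filenames: list[str]) -> bool:
    """A single-component repo has a config and weights but no model_index.json."""
    has_config = "config.json" in filenames
    has_weights = any(
        name.startswith("diffusion_pytorch_model") and name.endswith(".safetensors")
        for name in filenames
    )
    return has_config and has_weights and not is_pipeline_repo(filenames)


# Illustrative listing; the shard count is an assumption, not the real one.
sample_files = [
    "README.md",
    "config.json",
    "diffusion_pytorch_model-00001-of-00002.safetensors",
    "diffusion_pytorch_model-00002-of-00002.safetensors",
    "diffusion_pytorch_model.safetensors.index.json",
]

print(is_pipeline_repo(sample_files), is_component_repo(sample_files))
```

In practice the listing could come from `huggingface_hub.list_repo_files("MeiGen-AI/PosterOmni_v1")`, which needs network access and is therefore left out of the sketch.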

	---

	## 🧠 Method (High-level)

	PosterOmni is trained with a four-stage workflow:

	1. Task-specific SFT: train specialized experts for local editing and global creation tasks.
	2. Task Distillation: distill expert knowledge into a single multi-task model.
	3. Unified Reward Training: learn a universal reward for text fidelity, visual consistency, and aesthetics.
	4. Omni-Edit Reinforcement Learning: further align the model with unified reward feedback.

	<div align="center">
	<img src="images/overview.jpg" alt="PosterOmni Model Architecture" width="1000"/>
	</div>

	---

	## 📚 PosterOmni Dataset

	We introduce a unified data suite with PosterOmni-200K (training) and PosterOmni-Bench (evaluation) for image-to-poster generation.
	PosterOmni-200K contains 200K+ paired samples covering six tasks: four local editing tasks (Rescaling, Filling, Extending, Identity-driven) and two global creation tasks (Layout-driven, Style-driven). It spans six poster themes: Products, Food, Events/Travel, Nature, Education, and Entertainment.
	PosterOmni-Bench provides 540 Chinese and 480 English prompts, evenly distributed across the same six themes for consistent evaluation across tasks.
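
As a quick check of the benchmark arithmetic, the sketch below works out the per-theme prompt counts. It assumes that "evenly distributed" means an equal number of prompts per theme within each language; `per_theme_counts` is a hypothetical helper, not part of the released code.

```python
THEMES = ["Products", "Food", "Events/Travel", "Nature", "Education", "Entertainment"]
PROMPTS_PER_LANGUAGE = {"zh": 540, "en": 480}


def per_theme_counts(prompts: dict[str, int], themes: list[str]) -> dict[str, int]:
    """Split each language's prompt total evenly across themes."""
    counts = {}
    for language, total in prompts.items():
        per_theme, remainder = divmod(total, len(themes))
        if remainder:
            raise ValueError(f"{total} prompts do not divide evenly over {len(themes)} themes")
        counts[language] = per_theme
    return counts


counts = per_theme_counts(PROMPTS_PER_LANGUAGE, THEMES)
total_prompts = sum(PROMPTS_PER_LANGUAGE.values())
print(counts, total_prompts)
```

Under that assumption each theme gets 90 Chinese and 80 English prompts, for 1,020 prompts overall.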

	<div align="center">
	<img src="images/posteromni_datapipeline.jpg" alt="PosterOmni Dataset Construction Pipeline" width="1000"/>
	</div>

	---

	## 📊 Performance Benchmarks

	<div align="center">
	<img src="images/results.png" alt="PosterOmni Results" width="1000"/>
	</div>

	---

	## 🧩 Supported Tasks

	| Regime | Tasks |
	| ---------------------- | ------------------------------------------------- |
	| Poster Local Editing | Rescaling · Filling · Extending · Identity-driven |
	| Poster Global Creation | Layout-driven · Style-driven |

	---

	## 🔗 Related Project

	We also have another text-to-poster work that may interest you:

	> PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework
	> [![GitHub](https://img.shields.io/badge/GitHub-PosterCraft-black?logo=github)](https://github.com/Ephemeral182/PosterCraft)
	> [![arXiv](https://img.shields.io/badge/arXiv-2506.10741-red)](https://arxiv.org/abs/2506.10741)
	> [![Project](https://img.shields.io/badge/Project%20Page-Visit-blue)](https://ephemeral182.github.io/PosterCraft/)
	> [![HF](https://img.shields.io/badge/🤗-Model-yellow)](https://huggingface.co/PosterCraft/PosterCraft-v1_RL)

	---

	## 📌 Model Files

	This repository provides:

	* `config.json`
	* `diffusion_pytorch_model-*.safetensors`
	* `diffusion_pytorch_model.safetensors.index.json`

	(i.e., the weights of the pipeline's `transformer` component in Diffusers format.)
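
The index file ties the shards together: in the sharded safetensors layout, its `weight_map` entry maps each parameter name to the shard file that stores it. The sketch below lists the shards an index refers to. `shards_from_index` is a hypothetical helper, and the parameter names, shard names, and size in the sample are invented for illustration.

```python
import json


def shards_from_index(index_text: str) -> list[str]:
    """Return the sorted, de-duplicated shard filenames named in a weight index."""
    index = json.loads(index_text)
    return sorted(set(index["weight_map"].values()))


# Made-up index content with the same shape as a real index file.
sample_index = json.dumps(
    {
        "metadata": {"total_size": 123456},
        "weight_map": {
            "blocks.0.attn.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
            "blocks.0.mlp.weight": "diffusion_pytorch_model-00001-of-00002.safetensors",
            "blocks.1.attn.weight": "diffusion_pytorch_model-00002-of-00002.safetensors",
        },
    }
)

shards = shards_from_index(sample_index)
print(shards)
```

Comparing this list against the files actually downloaded is a simple way to spot an incomplete checkpoint before loading.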

	---

	## 📬 Contact

	Sixiang Chen: `schen691@connect.hkust-gz.edu.cn`

	Jianyu Lai: `jlai218@connect.hkust-gz.edu.cn`

	Jialin Gao: `gaojialin04@meituan.com`

	Hengyu Shi: `qq1842084@gmail.com`

	Zhongying Liu: `liuzhongying@meituan.com`


	---

	## 📝 Citation

	If you find PosterOmni useful for your research, please cite:

	```bibtex
	@article{chen2026posteromni,
	title={PosterOmni: Generalized Artistic Poster Creation via Task Distillation and Unified Reward Feedback},
	author={Chen, Sixiang and Lai, Jianyu and Gao, Jialin and Shi, Hengyu and Liu, Zhongying and Ye, Tian and Luo, Junfeng and Wei, Xiaoming and Zhu, Lei},
	journal={arXiv preprint arXiv:2602.12127},
	year={2026}
	}
	```