| | --- |
| | pipeline_tag: any-to-any |
| | library_name: transformers |
| | tags: |
| | - text-to-image |
| | - image-editing |
| | - image-understanding |
| | - vision-language |
| | - multimodal |
| | - unified-model |
| | license: mit |
| | --- |
| | |
| | ## 🌌 Unipic3-Consistency-Model |
| | <div align="center"> |
| | <img src="skywork-logo.png" alt="Skywork Logo" width="500"> |
| | </div> |
| |
|
| | <p align="center"> |
| | <a href="https://github.com/SkyworkAI/UniPic"> |
| | <img src="https://img.shields.io/badge/GitHub-UniPic-blue?logo=github" alt="GitHub Repo"> |
| | </a> |
| | <a href="https://github.com/SkyworkAI/UniPic/stargazers"> |
| | <img src="https://img.shields.io/github/stars/SkyworkAI/UniPic?style=social" alt="GitHub Stars"> |
| | </a> |
| | <a href="https://github.com/SkyworkAI/UniPic/network/members"> |
| | <img src="https://img.shields.io/github/forks/SkyworkAI/UniPic?style=social" alt="GitHub Forks"> |
| | </a> |
| | </p> |
| | |
| | ## 📖 Introduction |
| | <div align="center"> <img src="unipic3.png" alt="Model Teaser" width="720"> </div> |
| |
|
| | **UniPic3-Consistency-Model** is a few-step image editing and multi-image composition model based on **Consistency Flow Matching (CM)**. |
| | The model learns a *trajectory-consistent* mapping from noisy latent states to clean images, enabling stable generation with strong structural consistency. |
| | It is distilled from **UniPic-3** to support **fast inference (≤8 steps)** while preserving composition correctness. The model is especially suitable for scenarios requiring **geometric alignment** and **semantic coherence**, such as multi-image composition and human–object interaction (HOI). |
| |
|
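
To make the few-step idea concrete, here is a minimal sketch of a generic multistep consistency sampling loop. It is not the UniPic-3 implementation: `toy_consistency_fn`, `TARGET`, and the linear re-noising rule `x_t = (1 - t) * x_0 + t * noise` are assumptions chosen for illustration, and the toy function simply stands in for a trained network.

```python
import random

TARGET = [0.5, -1.0, 2.0]  # the only "clean sample" in this toy setup (hypothetical)
STEPS = 8                  # matches the <=8-step budget mentioned above


def toy_consistency_fn(x_t, t):
    """Toy stand-in for a trained network f(x_t, t) -> x_0."""
    # With a single clean sample, every noisy state maps back to that
    # sample, so the ideal consistency function is constant.
    return list(TARGET)


def multistep_consistency_sample(consistency_fn, timesteps, dim, rng):
    """Alternate 'jump to x_0' and 're-noise to the next, smaller t'."""
    # Start from pure Gaussian noise at the largest timestep.
    x_t = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    x_0 = consistency_fn(x_t, timesteps[0])
    for t in timesteps[1:]:
        noise = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Assumed linear interpolation between the estimate and fresh noise.
        x_t = [(1.0 - t) * a + t * n for a, n in zip(x_0, noise)]
        x_0 = consistency_fn(x_t, t)
    return x_0


# Eight decreasing timesteps: 1.0, 0.875, ..., 0.125.
timesteps = [1.0 - i / STEPS for i in range(STEPS)]
sample = multistep_consistency_sample(
    toy_consistency_fn, timesteps, len(TARGET), random.Random(0)
)
print(sample)
```

Each iteration makes one network call, so the number of timesteps is the number of function evaluations. With a real model, `toy_consistency_fn` would be replaced by the distilled transformer.
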
| | ## 📊 Benchmarks |
| | <div align="center"> <img src="unipic3_eval.png" alt="Benchmark Results" width="720"> </div> |
| |
|
| |
|
| | ## 🧠 Usage |
| |
|
| | ### 1. Clone the Repository |
| | ```bash |
| | git clone https://github.com/SkyworkAI/UniPic |
| | cd UniPic/UniPic-3 |
| | ``` |
| |
|
| | ### 2. Set Up the Environment |
| | ```bash |
| | conda create -n unipic3 python=3.10 |
| | conda activate unipic3 |
| | pip install -r requirements.txt |
| | ``` |
| |
|
| |
|
| | ### 3. Batch Inference |
| | ```bash |
| | # Path to the EMA transformer weights (no spaces around "=" in bash) |
| | transformer_path="Skywork/Unipic3-Consistency-Model/ema_transformer" |
| | |
| | python -m torch.distributed.launch --nproc_per_node=1 --master_port 29501 --use_env \ |
| | qwen_image_edit_fast/batch_inference.py \ |
| | --jsonl_path data/val.jsonl \ |
| | --output_dir work_dirs/output \ |
| | --distributed \ |
| | --num_inference_steps 8 \ |
| | --true_cfg_scale 4.0 \ |
| | --transformer "$transformer_path" \ |
| | --skip_existing |
| | ``` |
| |
|
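
If you want to drive the batch inference script from Python, for example to compare several step counts, the command can be assembled as an argument list. The sketch below is an assumption-laden convenience wrapper, not part of the repository: `build_inference_command`, the 4-step setting, and the per-run output directory names are hypothetical, while the flags are copied from the batch inference command in this section.

```python
import shlex
import sys


def build_inference_command(transformer_path, num_inference_steps, output_dir,
                            jsonl_path="data/val.jsonl", true_cfg_scale=4.0,
                            master_port=29501):
    """Return the batch-inference launch command as an argument list."""
    return [
        sys.executable, "-m", "torch.distributed.launch",
        "--nproc_per_node=1", "--master_port", str(master_port), "--use_env",
        "qwen_image_edit_fast/batch_inference.py",
        "--jsonl_path", jsonl_path,
        "--output_dir", output_dir,
        "--distributed",
        "--num_inference_steps", str(num_inference_steps),
        "--true_cfg_scale", str(true_cfg_scale),
        "--transformer", transformer_path,
        "--skip_existing",
    ]


TRANSFORMER = "Skywork/Unipic3-Consistency-Model/ema_transformer"

# Hypothetical sweep: one output directory per step count.
commands = [
    build_inference_command(TRANSFORMER, steps, f"work_dirs/output_{steps}steps")
    for steps in (4, 8)
]

for cmd in commands:
    # Print only; pass `cmd` to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems and keeps the transformer path in one place.
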
| | ## 📄 License |
| | This model is released under the MIT License. |
| |
|
| | ## Citation |
| | If you use Skywork UniPic 3.0 in your research, please cite: |
| | ```bibtex |
| | @article{wei2026skywork, |
| | title={Skywork UniPic 3.0: Unified Multi-Image Composition via Sequence Modeling}, |
| | author={Wei, Hongyang and Liu, Hongbo and Wang, Zidong and Peng, Yi and Xu, Baixin and Wu, Size and Zhang, Xuying and He, Xianglong and Liu, Zexiang and Wang, Peiyu and others}, |
| | journal={arXiv preprint arXiv:2601.15664}, |
| | year={2026} |
| | } |
| | ``` |
| |
|
| |
|