---
language: en
library_name: pytorch-image-translation-models
pipeline_tag: image-to-image
tags:
- image-to-image
- diffusion
- image-translation
- DiffuseIT
- text-guided
- style-transfer
---
# DiffuseIT Checkpoints
*Diffusion-based Image Translation using Disentangled Style and Content Representation* (Kwon & Ye, ICLR 2023).

Checkpoints converted from [cyclomon/DiffuseIT](https://github.com/cyclomon/DiffuseIT) for use with `pytorch-image-translation-models`.
## Model Variants
| Subfolder | Dataset | Resolution | Description |
|---|---|---|---|
| `imagenet256-uncond` | ImageNet | 256×256 | Unconditional diffusion model for general image translation |
| `ffhq-256` | FFHQ | 256×256 | Face-focused model with identity preservation (self-contained: `unet` + `id_model`) |
## Installation
```bash
pip install pytorch-image-translation-models
```
Clone the DiffuseIT repository (required for the CLIP and ViT losses) and install its dependencies:
```bash
git clone https://github.com/cyclomon/DiffuseIT.git projects/DiffuseIT
cd projects/DiffuseIT
pip install ftfy regex lpips kornia opencv-python color-matcher
pip install git+https://github.com/openai/CLIP.git
```
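After installing, a quick sanity check can confirm the loss dependencies import cleanly. This is a minimal sketch, not part of the original card; the module names simply correspond to the PyPI packages installed above:

```python
# Sanity check: verify the DiffuseIT loss dependencies are importable.
import clip           # OpenAI CLIP (text/image guidance)
import lpips          # perceptual loss
import kornia         # differentiable image ops
import cv2            # opencv-python
import color_matcher  # color matching for image-guided translation

print("All DiffuseIT dependencies imported successfully.")
```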
## Usage
```python
from PIL import Image

from examples.community.diffuseit import load_diffuseit_community_pipeline

# Placeholder inputs: replace with your own image paths.
img = Image.open("source.png").convert("RGB")
style_ref = Image.open("style.png").convert("RGB")

# ImageNet 256
pipe = load_diffuseit_community_pipeline(
    "BiliSakura/DiffuseIT-ckpt/imagenet256-uncond",  # or local path
    diffuseit_src_path="projects/DiffuseIT",
)
pipe.to("cuda")

# Text-guided: translate the source ("Lion") toward the target prompt
out = pipe(
    source_image=img,
    prompt="Black Leopard",
    source="Lion",
    use_range_restart=True,
    use_noise_aug_all=True,
    output_type="pil",
)

# Image-guided: transfer the style of a reference image
out = pipe(
    source_image=img,
    target_image=style_ref,
    use_colormatch=True,
    output_type="pil",
)
```
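The `ffhq-256` variant is not shown in the original examples; the sketch below assumes the same loader resolves its subfolder the same way as `imagenet256-uncond`, and that the bundled `id_model` handles identity preservation without extra arguments. The prompt is an illustrative placeholder:

```python
# Hypothetical sketch: loading the face-focused FFHQ variant.
# Assumes load_diffuseit_community_pipeline accepts the ffhq-256 subfolder
# like the ImageNet checkpoint; id_model ships inside the checkpoint.
face_pipe = load_diffuseit_community_pipeline(
    "BiliSakura/DiffuseIT-ckpt/ffhq-256",
    diffuseit_src_path="projects/DiffuseIT",
)
face_pipe.to("cuda")

out = face_pipe(
    source_image=img,           # reusing the source image loaded above
    prompt="a smiling person",  # placeholder prompt, not from the original card
    output_type="pil",
)
```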
## Citation
```bibtex
@inproceedings{kwon2023diffuseit,
  title={Diffusion-based Image Translation using Disentangled Style and Content Representation},
  author={Kwon, Gihyun and Ye, Jong Chul},
  booktitle={ICLR},
  year={2023},
  url={https://arxiv.org/abs/2209.15264}
}
```