---
language: en
library_name: pytorch-image-translation-models
pipeline_tag: image-to-image
tags:
- image-to-image
- diffusion
- image-translation
- DiffuseIT
- text-guided
- style-transfer
---

# DiffuseIT Checkpoints

Checkpoints for Diffusion-based Image Translation using Disentangled Style and Content Representation ([Kwon & Ye, ICLR 2023](https://arxiv.org/abs/2209.15264)).

Converted from [cyclomon/DiffuseIT](https://github.com/cyclomon/DiffuseIT) for use with `pytorch-image-translation-models`.

## Model Variants

| Subfolder | Dataset | Resolution | Description |
|-----------|---------|------------|-------------|
| [imagenet256-uncond](imagenet256-uncond/) | ImageNet | 256×256 | Unconditional diffusion model for general image translation |
| [ffhq-256](ffhq-256/) | FFHQ | 256×256 | Face-focused model with identity preservation (self-contained: unet + id_model) |

## Installation

```bash
pip install pytorch-image-translation-models
```

Clone the DiffuseIT repository (required for the CLIP and ViT losses) and install its dependencies:

```bash
git clone https://github.com/cyclomon/DiffuseIT.git projects/DiffuseIT
cd projects/DiffuseIT
pip install ftfy regex lpips kornia opencv-python color-matcher
pip install git+https://github.com/openai/CLIP.git
```

## Usage

```python
from PIL import Image

from examples.community.diffuseit import load_diffuseit_community_pipeline

# Placeholder paths; inputs are assumed here to be RGB PIL images.
img = Image.open("source.png").convert("RGB")
style_ref = Image.open("style.png").convert("RGB")

# ImageNet 256
pipe = load_diffuseit_community_pipeline(
    "BiliSakura/DiffuseIT-ckpt/imagenet256-uncond",  # or local path
    diffuseit_src_path="projects/DiffuseIT",  # the clone from Installation
)
pipe.to("cuda")

# Text-guided: translate from the `source` description to the `prompt`
out = pipe(
    source_image=img,
    prompt="Black Leopard",
    source="Lion",
    use_range_restart=True,
    use_noise_aug_all=True,
    output_type="pil",
)

# Image-guided: use a reference image in place of a text prompt
out = pipe(
    source_image=img,
    target_image=style_ref,
    use_colormatch=True,
    output_type="pil",
)
```

## Citation

```bibtex
@inproceedings{kwon2023diffuseit,
  title={Diffusion-based Image Translation using Disentangled Style and Content Representation},
  author={Kwon, Gihyun and Ye, Jong Chul},
  booktitle={ICLR},
  year={2023},
  url={https://arxiv.org/abs/2209.15264}
}
```
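
## Input Preparation

Both checkpoints listed under Model Variants operate at 256×256. The pipeline may already resize its inputs internally; if it does not, a source image can be brought to that resolution beforehand. The following is a minimal sketch under that assumption. `center_crop_box` and `prepare_image` are hypothetical helpers written for this example, not part of `pytorch-image-translation-models` or DiffuseIT, and Pillow is assumed to be installed.

```python
from typing import Tuple


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, upper, right, lower) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def prepare_image(path: str, size: int = 256):
    """Load an image, center-crop it to a square, and resize to size x size.

    Assumption: the pipeline accepts an RGB PIL image as `source_image`.
    """
    # Imported lazily so center_crop_box stays dependency-free.
    from PIL import Image

    img = Image.open(path).convert("RGB")
    img = img.crop(center_crop_box(*img.size))
    return img.resize((size, size), Image.LANCZOS)
```

With these helpers, `img = prepare_image("source.png")` (a placeholder path) yields a 256×256 RGB image that can be passed as `source_image`. Center-cropping before resizing avoids distorting the aspect ratio, at the cost of discarding the image borders.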