---
language: en
library_name: pytorch-image-translation-models
pipeline_tag: image-to-image
tags:
- image-to-image
- diffusion
- image-translation
- DiffuseIT
- text-guided
- style-transfer
---

# DiffuseIT Checkpoints

Diffusion-based Image Translation using Disentangled Style and Content Representation ([Kwon & Ye, ICLR 2023](https://arxiv.org/abs/2209.15264)).

Converted from [cyclomon/DiffuseIT](https://github.com/cyclomon/DiffuseIT) for use with `pytorch-image-translation-models`.

## Model Variants

| Subfolder | Dataset | Resolution | Description |
|-----------|---------|------------|-------------|
| [imagenet256-uncond](imagenet256-uncond/) | ImageNet | 256×256 | Unconditional diffusion model for general image translation |
| [ffhq-256](ffhq-256/) | FFHQ | 256×256 | Face-focused model with identity preservation (self-contained: unet + id_model) |

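To fetch a single variant rather than the whole repository, something like the following may work. It is a minimal sketch and not part of `pytorch-image-translation-models`: it assumes the `huggingface_hub` package is installed and that the Hub repository ID is the first two components of the path used in the Usage section (`BiliSakura/DiffuseIT-ckpt`). The helper names `split_variant_path` and `download_variant` are made up for illustration.

```python
from pathlib import Path


def split_variant_path(path: str) -> tuple[str, str]:
    """Split 'owner/repo/subfolder' into ('owner/repo', 'subfolder')."""
    owner, repo, subfolder = path.strip("/").split("/", 2)
    return f"{owner}/{repo}", subfolder


def download_variant(path: str) -> Path:
    """Download only one variant subfolder and return its local directory."""
    # Imported lazily so split_variant_path works without huggingface_hub.
    from huggingface_hub import snapshot_download

    repo_id, subfolder = split_variant_path(path)
    # allow_patterns restricts the download to files under the subfolder.
    local_root = snapshot_download(
        repo_id=repo_id,
        allow_patterns=[f"{subfolder}/*"],
    )
    return Path(local_root) / subfolder
```

Calling `download_variant("BiliSakura/DiffuseIT-ckpt/ffhq-256")` should return a local directory that can be passed to the loader in place of the Hub path.
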
## Installation

```bash
pip install pytorch-image-translation-models
```

Clone the DiffuseIT repository (required for the CLIP and ViT losses) and install its dependencies:

```bash
git clone https://github.com/cyclomon/DiffuseIT.git projects/DiffuseIT
cd projects/DiffuseIT
pip install ftfy regex lpips kornia opencv-python color-matcher
pip install git+https://github.com/openai/CLIP.git
```

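Before loading a checkpoint, it can help to confirm that the extra dependencies are importable. The check below is a small sketch using only the standard library. The mapping from pip package names to import names reflects how these packages are normally imported, and `missing_packages` is a hypothetical helper, not something shipped with either repository.

```python
import importlib.util

# pip package name -> top-level module name it provides.
REQUIRED_MODULES = {
    "ftfy": "ftfy",
    "regex": "regex",
    "lpips": "lpips",
    "kornia": "kornia",
    "opencv-python": "cv2",
    "color-matcher": "color_matcher",
    "git+https://github.com/openai/CLIP.git": "clip",
}


def missing_packages(required: dict[str, str]) -> list[str]:
    """Return the pip names whose top-level module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    # An empty list means every dependency was found.
    print(missing_packages(REQUIRED_MODULES))
```

Any name in the printed list can be installed again with the matching `pip install` command from the block above.
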
## Usage

```python
from PIL import Image

from examples.community.diffuseit import load_diffuseit_community_pipeline

# Placeholder file names; this assumes the pipeline accepts PIL images.
img = Image.open("source.jpg").convert("RGB")
style_ref = Image.open("style.jpg").convert("RGB")

# ImageNet 256
pipe = load_diffuseit_community_pipeline(
    "BiliSakura/DiffuseIT-ckpt/imagenet256-uncond",  # or local path
    diffuseit_src_path="projects/DiffuseIT",
)
pipe.to("cuda")

# Text-guided: `source` describes the input image, `prompt` the desired output
out = pipe(
    source_image=img,
    prompt="Black Leopard",
    source="Lion",
    use_range_restart=True,
    use_noise_aug_all=True,
    output_type="pil",
)

# Image-guided: `target_image` supplies the style reference
out = pipe(
    source_image=img,
    target_image=style_ref,
    use_colormatch=True,
    output_type="pil",
)
```

## Citation

```bibtex
@inproceedings{kwon2023diffuseit,
  title={Diffusion-based Image Translation using Disentangled Style and Content Representation},
  author={Kwon, Gihyun and Ye, Jong Chul},
  booktitle={ICLR},
  year={2023},
  url={https://arxiv.org/abs/2209.15264}
}
```