---
language: en
library_name: pytorch-image-translation-models
pipeline_tag: image-to-image
tags:
  - image-to-image
  - diffusion
  - image-translation
  - DiffuseIT
  - text-guided
  - style-transfer
---

# DiffuseIT Checkpoints

Diffusion-based Image Translation using Disentangled Style and Content Representation ([Kwon & Ye, ICLR 2023](https://arxiv.org/abs/2209.15264)).

Converted from [cyclomon/DiffuseIT](https://github.com/cyclomon/DiffuseIT) for use with `pytorch-image-translation-models`.

## Model Variants

| Subfolder | Dataset | Resolution | Description |
|-----------|---------|------------|-------------|
| [imagenet256-uncond](imagenet256-uncond/) | ImageNet | 256×256 | Unconditional diffusion model for general image translation |
| [ffhq-256](ffhq-256/) | FFHQ | 256×256 | Face-focused model with identity preservation (self-contained: unet + id_model) |

## Installation

```bash
pip install pytorch-image-translation-models
```

Clone the DiffuseIT repository (required for the CLIP and ViT losses) and install its dependencies:

```bash
git clone https://github.com/cyclomon/DiffuseIT.git projects/DiffuseIT
cd projects/DiffuseIT
pip install ftfy regex lpips kornia opencv-python color-matcher
pip install git+https://github.com/openai/CLIP.git
```
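The pipeline loader accepts a `diffuseit_src_path` argument (see Usage below), so you normally do not need to touch `sys.path` yourself. If you do want to import DiffuseIT's modules directly, a minimal sketch, assuming the clone location from the commands above:

```python
import sys
from pathlib import Path

# Assumed layout: the DiffuseIT repo was cloned to projects/DiffuseIT as in
# the git clone command above. Prepending it to sys.path makes its CLIP/ViT
# loss modules importable from scripts run at the project root.
diffuseit_src = Path("projects/DiffuseIT")
sys.path.insert(0, str(diffuseit_src.resolve()))
```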

## Usage

```python
from PIL import Image

from examples.community.diffuseit import load_diffuseit_community_pipeline

# ImageNet 256 (for faces, use the "ffhq-256" subfolder instead)
pipe = load_diffuseit_community_pipeline(
    "BiliSakura/DiffuseIT-ckpt/imagenet256-uncond",  # or local path
    diffuseit_src_path="projects/DiffuseIT",
)
pipe.to("cuda")

# Load the source image to translate
img = Image.open("source.png").convert("RGB")

# Text-guided translation
out = pipe(
    source_image=img,
    prompt="Black Leopard",
    source="Lion",
    use_range_restart=True,
    use_noise_aug_all=True,
    output_type="pil",
)

# Image-guided translation (a style reference image instead of a prompt)
style_ref = Image.open("style.png").convert("RGB")
out = pipe(
    source_image=img,
    target_image=style_ref,
    use_colormatch=True,
    output_type="pil",
)
```

## Citation

```bibtex
@inproceedings{kwon2023diffuseit,
  title={Diffusion-based Image Translation using Disentangled Style and Content Representation},
  author={Kwon, Gihyun and Ye, Jong Chul},
  booktitle={ICLR},
  year={2023},
  url={https://arxiv.org/abs/2209.15264}
}
```