aihao committed on
Commit 9890667 · 1 Parent(s): 1e085c7

add init code

.gitignore ADDED
@@ -0,0 +1,162 @@
+ # Byte-compiled / optimized / DLL files
+ __pycache__/
+ *.py[cod]
+ *$py.class
+
+ # C extensions
+ *.so
+
+ # Distribution / packaging
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ share/python-wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyInstaller
+ # Usually these files are written by a python script from a template
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
+ *.manifest
+ *.spec
+
+ # Installer logs
+ pip-log.txt
+ pip-delete-this-directory.txt
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .nox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ *.py,cover
+ .hypothesis/
+ .pytest_cache/
+ cover/
+
+ # Translations
+ *.mo
+ *.pot
+
+ # Django stuff:
+ *.log
+ local_settings.py
+ db.sqlite3
+ db.sqlite3-journal
+
+ # Flask stuff:
+ instance/
+ .webassets-cache
+
+ # Scrapy stuff:
+ .scrapy
+
+ # Sphinx documentation
+ docs/_build/
+
+ # PyBuilder
+ .pybuilder/
+ target/
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # IPython
+ profile_default/
+ ipython_config.py
+
+ # pyenv
+ # For a library or package, you might want to ignore these files since the code is
+ # intended to run in multiple environments; otherwise, check them in:
+ # .python-version
+
+ # pipenv
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
+ # install all needed dependencies.
+ #Pipfile.lock
+
+ # poetry
+ # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
+ # commonly ignored for libraries.
+ # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+ #poetry.lock
+
+ # pdm
+ # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+ #pdm.lock
+ # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+ # in version control.
+ # https://pdm.fming.dev/latest/usage/project/#working-with-version-control
+ .pdm.toml
+ .pdm-python
+ .pdm-build/
+
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+ __pypackages__/
+
+ # Celery stuff
+ celerybeat-schedule
+ celerybeat.pid
+
+ # SageMath parsed files
+ *.sage.py
+
+ # Environments
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # Spyder project settings
+ .spyderproject
+ .spyproject
+
+ # Rope project settings
+ .ropeproject
+
+ # mkdocs documentation
+ /site
+
+ # mypy
+ .mypy_cache/
+ .dmypy.json
+ dmypy.json
+
+ # Pyre type checker
+ .pyre/
+
+ # pytype static type analyzer
+ .pytype/
+
+ # Cython debug symbols
+ cython_debug/
+
+ # PyCharm
+ # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+ # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+ # and can be added to the global gitignore or merged into this file. For a more nuclear
+ # option (not recommended) you can uncomment the following to ignore the entire idea folder.
+ #.idea/
README.assets/example.jpg ADDED

Git LFS Details

  • SHA256: f3528bd6d4ec4c380302c2ec6aa0f62c9e218b961b4923f837acdec9801cc162
  • Pointer size: 131 Bytes
  • Size of remote file: 124 kB
README.assets/main.png ADDED

Git LFS Details

  • SHA256: 67020dadba987aa7e5f4260a9a3cd7e6bcd1b0c28db1dfc2fc7bc8236d4c0c26
  • Pointer size: 132 Bytes
  • Size of remote file: 1.8 MB
README.assets/more_examples.png ADDED

Git LFS Details

  • SHA256: 89e8f3726d676a67cb58191d66b2be1d601f4847837e177e91353b2351a2115d
  • Pointer size: 132 Bytes
  • Size of remote file: 1.44 MB
README.md CHANGED
@@ -1 +1,44 @@
- # IP-Adapter-Artist
+ # IP Adapter Artist
+
+ <a href='https://huggingface.co/AisingioroHao0/IP-Adapter-Artist'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue'></a><a href=''><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-blue'></a> [![**IP Adapter Artist Demo**](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kV7q3Gzr8GPG9cChdDQ5ncCx84TYjuu3?usp=sharing)
+
+ ![image-20240807232402569](./README.assets/main.png)
+
+ ------
+
+ ## Introduction
+
+ IP Adapter Artist is a specialized IP-Adapter variant built around a dedicated style encoder. Its goal is to achieve style control from reference images in text-to-image diffusion models while addressing the instability and incomplete stylization of existing methods. This is a preprint release; more models and training data are coming soon.
+
+ ## How to use
+
+ The [![**IP Adapter Artist Demo**](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1kV7q3Gzr8GPG9cChdDQ5ncCx84TYjuu3?usp=sharing) notebook can be used to run experiments directly.
+
+ For local experiments, please refer to the [demo notebook](https://github.com/aihao2000/IP-Adapter-Artist/blob/main/ip_adapter_artist_sdxl_demo.ipynb).
+
+ Local experiments require a basic PyTorch environment and the following dependencies:
+
+ ```
+ pip install diffusers
+ pip install transformers
+ pip install git+https://github.com/openai/CLIP.git
+ pip install git+https://github.com/aihao2000/IP-Adapter-Artist.git
+ ```
+
+ ## More Examples
+
+ ![image-20240808001612810](./README.assets/more_examples.png)
+
+ ## Citation
+
+ ```
+ @misc{IP-Adapter-Artist,
+     author = {Hao Ai},
+     title = {IP Adapter Artist},
+     year = {2024},
+     publisher = {GitHub},
+     journal = {GitHub repository},
+     howpublished = {\url{https://github.com/aihao2000/IP-Adapter-Artist}}
+ }
+ ```
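For orientation, the demo notebook added later in this commit condenses to a short script. The following is a minimal sketch, assuming the `csd_clip.pth` and `ip_adapter_artist_sdxl_512.pth` weights published in the `AisingioroHao0/IP-Adapter-Artist` Hugging Face repo and a CUDA device:

```python
# Minimal end-to-end sketch condensed from ip_adapter_artist_sdxl_demo.ipynb.
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.utils import load_image
from huggingface_hub import hf_hub_download
from transformers import CLIPImageProcessor

from ip_adapter_artist.utils.ip_adapter import load_ip_adapter

# Style encoder (a pickled CSD_CLIP module) and the SDXL base pipeline.
csd_clip = (
    torch.load(hf_hub_download("AisingioroHao0/IP-Adapter-Artist", "csd_clip.pth"))
    .to("cuda")
    .requires_grad_(False)
    .eval()
)
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    variant="fp16",
    torch_dtype=torch.float16,
).to("cuda")
load_ip_adapter(
    pipe.unet,
    hf_hub_download("AisingioroHao0/IP-Adapter-Artist", "ip_adapter_artist_sdxl_512.pth"),
)
# Per the notebook: enable the adapter on only one attention layer of up block 0.
pipe.set_ip_adapter_scale({"up": {"block_0": [0.0, 1.0, 0.0]}})

# Encode a style reference image into a CSD style embedding.
style_image = load_image(
    "https://github.com/aihao2000/IP-Adapter-Artist/blob/main/README.assets/example.jpg?raw=true"
)
pixel_values = CLIPImageProcessor().preprocess(style_image, return_tensors="pt").pixel_values
_, _, style_embeds = csd_clip(pixel_values.to("cuda", torch.float32))

# Zero embeds form the negative half for classifier-free guidance.
embeds = torch.stack([torch.zeros_like(style_embeds), style_embeds]).to("cuda", torch.float16)
result = pipe(
    prompt="A cat sitting on a table, top hat, best quality, masterpiece",
    ip_adapter_image_embeds=[embeds],
    num_inference_steps=30,
    guidance_scale=5.0,
).images[0]
```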
ip_adapter_artist/__init__.py ADDED
File without changes
ip_adapter_artist/utils/__init__.py ADDED
File without changes
ip_adapter_artist/utils/csd_clip.py ADDED
@@ -0,0 +1,145 @@
+ import torch
+ import torch.nn as nn
+ import clip
+ import copy
+ from torch.autograd import Function
+
+ from collections import OrderedDict
+
+
+ def convert_state_dict(state_dict):
+     """Strip the "module." prefix that DataParallel adds to checkpoint keys."""
+     new_state_dict = OrderedDict()
+     for k, v in state_dict.items():
+         if k.startswith("module."):
+             k = k.replace("module.", "")
+         new_state_dict[k] = v
+     return new_state_dict
+
+
+ def convert_weights_float(model: nn.Module):
+     """Convert applicable model parameters to fp32"""
+
+     def _convert_weights_to_fp32(l):
+         if isinstance(l, (nn.Conv1d, nn.Conv2d, nn.Linear)):
+             l.weight.data = l.weight.data.float()
+             if l.bias is not None:
+                 l.bias.data = l.bias.data.float()
+
+         if isinstance(l, nn.MultiheadAttention):
+             for attr in [
+                 *[f"{s}_proj_weight" for s in ["in", "q", "k", "v"]],
+                 "in_proj_bias",
+                 "bias_k",
+                 "bias_v",
+             ]:
+                 tensor = getattr(l, attr)
+                 if tensor is not None:
+                     tensor.data = tensor.data.float()
+
+         for name in ["text_projection", "proj"]:
+             if hasattr(l, name):
+                 attr = getattr(l, name)
+                 if attr is not None:
+                     attr.data = attr.data.float()
+
+     model.apply(_convert_weights_to_fp32)
+
+
+ class ReverseLayerF(Function):
+     """Gradient reversal: identity forward, gradient negated and scaled by alpha backward."""
+
+     @staticmethod
+     def forward(ctx, x, alpha):
+         ctx.alpha = alpha
+
+         return x.view_as(x)
+
+     @staticmethod
+     def backward(ctx, grad_output):
+         output = grad_output.neg() * ctx.alpha
+
+         return output, None
+
+
+ # taken from https://github.com/moein-shariatnia/OpenAI-CLIP/blob/master/modules.py
+ class ProjectionHead(nn.Module):
+     def __init__(self, embedding_dim, projection_dim, dropout=0):
+         super().__init__()
+         self.projection = nn.Linear(embedding_dim, projection_dim)
+         self.gelu = nn.GELU()
+         self.fc = nn.Linear(projection_dim, projection_dim)
+         self.dropout = nn.Dropout(dropout)
+         self.layer_norm = nn.LayerNorm(projection_dim)
+
+     def forward(self, x):
+         projected = self.projection(x)
+         x = self.gelu(projected)
+         x = self.fc(x)
+         x = self.dropout(x)
+         x = x + projected
+         x = self.layer_norm(x)
+         return x
+
+
+ def init_weights(m):  # TODO: do we need init for layernorm?
+     if isinstance(m, nn.Linear):
+         torch.nn.init.xavier_uniform_(m.weight)
+         if m.bias is not None:
+             nn.init.normal_(m.bias, std=1e-6)
+
+
+ class CSD_CLIP(nn.Module):
+     """backbone + projection head"""
+
+     def __init__(self, name="vit_large", content_proj_head="default", model_path=None):
+         super(CSD_CLIP, self).__init__()
+         self.content_proj_head = content_proj_head
+         if name == "vit_large":
+             if model_path is None:
+                 clipmodel, _ = clip.load("models/ViT-L-14.pt")
+             else:
+                 clipmodel, _ = clip.load(model_path)
+             self.backbone = clipmodel.visual
+             self.embedding_dim = 1024
+             self.feat_dim = 1024  # output dim of the custom content head (required when content_proj_head == "custom")
+         elif name == "vit_base":
+             if model_path is None:
+                 clipmodel, _ = clip.load("ViT-B/16")
+             else:
+                 clipmodel, _ = clip.load(model_path)
+             self.backbone = clipmodel.visual
+             self.embedding_dim = 768
+             self.feat_dim = 512
+         else:
+             raise Exception("This model is not implemented")
+
+         convert_weights_float(self.backbone)
+         # Split the backbone's final projection into separate style and content heads.
+         self.last_layer_style = copy.deepcopy(self.backbone.proj)
+         if content_proj_head == "custom":
+             self.last_layer_content = ProjectionHead(self.embedding_dim, self.feat_dim)
+             self.last_layer_content.apply(init_weights)
+         else:
+             self.last_layer_content = copy.deepcopy(self.backbone.proj)
+
+         self.backbone.proj = None
+
+     @property
+     def dtype(self):
+         return self.backbone.conv1.weight.dtype
+
+     def forward(self, input_data, alpha=None):
+         feature = self.backbone(input_data)
+
+         # During adversarial training, alpha routes the content branch through
+         # the gradient reversal layer; at inference alpha stays None.
+         if alpha is not None:
+             reverse_feature = ReverseLayerF.apply(feature, alpha)
+         else:
+             reverse_feature = feature
+
+         style_output = feature @ self.last_layer_style
+         style_output = nn.functional.normalize(style_output, dim=1, p=2)
+
+         if self.content_proj_head == "custom":
+             content_output = self.last_layer_content(reverse_feature)
+         else:
+             content_output = reverse_feature @ self.last_layer_content
+         content_output = nn.functional.normalize(content_output, dim=1, p=2)
+         return feature, content_output, style_output
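`CSD_CLIP` splits a CLIP vision backbone's final projection into two L2-normalized heads, one for style and one for content; `ReverseLayerF` makes the content branch adversarial when `alpha` is passed during training. A minimal extraction sketch, assuming the downloadable `ViT-L/14` checkpoint is passed explicitly instead of the hardcoded `models/ViT-L-14.pt` default (the released `csd_clip.pth` is a pickled instance of this class, so `torch.load` also works, as the demo notebook does):

```python
# Sketch: style-embedding extraction with CSD_CLIP. "ViT-L/14" is passed
# explicitly so clip.load downloads the checkpoint rather than reading the
# default local path models/ViT-L-14.pt; the image file name is illustrative.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CSD_CLIP(name="vit_large", model_path="ViT-L/14").to(device).eval()
_, preprocess = clip.load("ViT-L/14")  # reuse CLIP's 224x224 preprocessing

image = preprocess(Image.open("style_reference.jpg")).unsqueeze(0).to(device)
with torch.no_grad():
    feature, content_embeds, style_embeds = model(image)
# feature is the raw 1024-d backbone output; both heads are L2-normalized 768-d.
print(style_embeds.shape)  # torch.Size([1, 768])
```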
ip_adapter_artist/utils/ip_adapter.py ADDED
@@ -0,0 +1,72 @@
+ from diffusers.models.attention_processor import IPAdapterAttnProcessor2_0, Attention
+ from diffusers.models.embeddings import (
+     ImageProjection,
+     MultiIPAdapterImageProjection,
+     IPAdapterPlusImageProjection,
+ )
+ import torch
+
+
+ def save_ip_adapter(unet, path):
+     """Save only the IP-Adapter weights: the image projection plus the attention processors."""
+     state_dict = {}
+     if (
+         hasattr(unet, "encoder_hid_proj")
+         and unet.encoder_hid_proj is not None
+         and isinstance(unet.encoder_hid_proj, torch.nn.Module)
+     ):
+         state_dict["encoder_hid_proj"] = unet.encoder_hid_proj.state_dict()
+
+     for name, module in unet.attn_processors.items():
+         if isinstance(module, torch.nn.Module):
+             state_dict[name] = module.state_dict()
+     torch.save(state_dict, path)
+
+
+ def load_ip_adapter(unet, path):
+     state_dict = torch.load(path, map_location="cpu")
+
+     if "encoder_hid_proj" in state_dict.keys():
+         # Infer the projection shapes from the checkpoint: the image projection
+         # maps CLIP embeddings to num_image_text_embeds tokens of cross-attention
+         # width, so the weight's first dim is num_image_text_embeds * cross_attention_dim.
+         num_image_text_embeds = 4
+         clip_embeddings_dim = state_dict["encoder_hid_proj"][
+             "image_projection_layers.0.image_embeds.weight"
+         ].shape[-1]
+         cross_attention_dim = (
+             state_dict["encoder_hid_proj"][
+                 "image_projection_layers.0.image_embeds.weight"
+             ].shape[0]
+             // 4
+         )
+         if not hasattr(unet, "encoder_hid_proj") or unet.encoder_hid_proj is None:
+             unet.encoder_hid_proj = MultiIPAdapterImageProjection(
+                 [
+                     ImageProjection(
+                         cross_attention_dim=cross_attention_dim,
+                         image_embed_dim=clip_embeddings_dim,
+                         num_image_text_embeds=num_image_text_embeds,
+                     )
+                 ]
+             ).to(unet.device, unet.dtype)
+         unet.encoder_hid_proj.load_state_dict(state_dict["encoder_hid_proj"])
+     else:
+         # No projection in the checkpoint: pass the image embeddings through unchanged
+         # and read the cross-attention width from a processor's to_k_ip weight instead.
+         unet.encoder_hid_proj = lambda x: x
+         cross_attention_dim = state_dict[
+             "down_blocks.1.attentions.0.transformer_blocks.0.attn2.processor"
+         ]["to_k_ip.0.weight"].shape[-1]
+
+     unet.config.encoder_hid_dim_type = "ip_image_proj"
+
+     # Swap every cross-attention ("attn2") processor for an IP-Adapter processor
+     # and load its to_k_ip/to_v_ip weights from the checkpoint.
+     for name, module in unet.named_modules():
+         if "attn2" in name and isinstance(module, Attention):
+             if not isinstance(module.processor, IPAdapterAttnProcessor2_0):
+                 module.set_processor(
+                     IPAdapterAttnProcessor2_0(
+                         hidden_size=module.query_dim,
+                         cross_attention_dim=cross_attention_dim,
+                     ).to(unet.device, unet.dtype)
+                 )
+             module.processor.load_state_dict(
+                 state_dict[f"{name}.processor"], strict=False
+             )
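These helpers serialize only the adapter-specific weights rather than the whole UNet: `save_ip_adapter` collects the `encoder_hid_proj` image projection plus each attention processor's state dict, and `load_ip_adapter` infers the projection shapes from the checkpoint and swaps every cross-attention (`attn2`) processor for an `IPAdapterAttnProcessor2_0`. A round-trip sketch, with illustrative file names:

```python
# Round-trip sketch: install the adapter into an SDXL UNet, then re-export
# only the adapter weights. File names here are illustrative.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
load_ip_adapter(pipe.unet, "ip_adapter_artist_sdxl_512.pth")  # installs processors
save_ip_adapter(pipe.unet, "ip_adapter_artist_export.pth")    # adapter-only checkpoint
```

Compared with saving the full UNet, the exported file holds only the projection and the per-layer `to_k_ip`/`to_v_ip` weights.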
ip_adapter_artist_sdxl_demo.ipynb ADDED
@@ -0,0 +1,210 @@
+ {
+  "cells": [
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "from ip_adapter_artist.utils.csd_clip import CSD_CLIP\n",
+     "from ip_adapter_artist.utils.ip_adapter import (\n",
+     "    load_ip_adapter,\n",
+     ")\n",
+     "import torch\n",
+     "from transformers import CLIPImageProcessor\n",
+     "from PIL import Image\n",
+     "from diffusers.utils import make_image_grid, load_image\n",
+     "from huggingface_hub import hf_hub_download\n",
+     "from diffusers import StableDiffusionXLPipeline"
+    ]
+   },
+   {
+    "attachments": {},
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "## Download Models"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "csd_clip_path = hf_hub_download(\n",
+     "    repo_id=\"AisingioroHao0/IP-Adapter-Artist\", filename=\"csd_clip.pth\"\n",
+     ")"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "ip_adapter_artist_path = hf_hub_download(\n",
+     "    repo_id=\"AisingioroHao0/IP-Adapter-Artist\", filename=\"ip_adapter_artist_sdxl_512.pth\"\n",
+     ")"
+    ]
+   },
+   {
+    "attachments": {},
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "## Load Model"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "csd_clip = torch.load(csd_clip_path).to(\"cuda\")\n",
+     "csd_clip.requires_grad_(False)\n",
+     "csd_clip = csd_clip.eval()"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "pipe = StableDiffusionXLPipeline.from_pretrained(\n",
+     "    \"stabilityai/stable-diffusion-xl-base-1.0\",\n",
+     "    variant=\"fp16\",\n",
+     "    torch_dtype=torch.float16,\n",
+     ").to(\"cuda\")"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "image_processor = CLIPImageProcessor()"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "load_ip_adapter(\n",
+     "    pipe.unet,\n",
+     "    ip_adapter_artist_path,\n",
+     ")"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "scale = {\"up\": {\"block_0\": [0.0, 1.0, 0.0]}}\n",
+     "pipe.set_ip_adapter_scale(scale)"
+    ]
+   },
+   {
+    "attachments": {},
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "## Process Style Image"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "image = load_image('https://github.com/aihao2000/IP-Adapter-Artist/blob/main/README.assets/example.jpg?raw=true')\n",
+     "image"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "pixel_values = image_processor.preprocess(image, return_tensors=\"pt\").pixel_values\n",
+     "_, __, style_embeds = csd_clip(pixel_values.to(\"cuda\", torch.float32))\n",
+     "ip_adapter_image_embeds = torch.stack(\n",
+     "    [torch.zeros_like(style_embeds).to(\"cuda\"), style_embeds]\n",
+     ").to(\"cuda\", torch.float16)"
+    ]
+   },
+   {
+    "attachments": {},
+    "cell_type": "markdown",
+    "metadata": {},
+    "source": [
+     "## Infer"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "result = pipe(\n",
+     "    ip_adapter_image_embeds=[ip_adapter_image_embeds],\n",
+     "    prompt=\"A cat sitting on a table, top hat, best quality, masterpiece\",\n",
+     "    negative_prompt=\"worst quality, low quality, low res, blurry, cropped image, jpeg artifacts, error, ugly, out of frame, deformed, poorly drawn\",\n",
+     "    generator=torch.Generator(\"cuda\").manual_seed(42),\n",
+     "    num_inference_steps=30,\n",
+     "    guidance_scale=5.0,\n",
+     ").images[0]\n",
+     "result"
+    ]
+   },
+   {
+    "cell_type": "code",
+    "execution_count": null,
+    "metadata": {},
+    "outputs": [],
+    "source": [
+     "result = pipe(\n",
+     "    ip_adapter_image_embeds=[ip_adapter_image_embeds],\n",
+     "    prompt=\"A house covered with ice and snow.\",\n",
+     "    negative_prompt=\"multi view, worst quality, low quality, low res, blurry, cropped image, jpeg artifacts, error, ugly, out of frame, deformed, poorly drawn\",\n",
+     "    generator=torch.Generator(\"cuda\").manual_seed(42),\n",
+     "    num_inference_steps=30,\n",
+     "    guidance_scale=5.0,\n",
+     ").images[0]\n",
+     "result"
+    ]
+   }
+  ],
+  "metadata": {
+   "kernelspec": {
+    "display_name": "torch",
+    "language": "python",
+    "name": "python3"
+   },
+   "language_info": {
+    "codemirror_mode": {
+     "name": "ipython",
+     "version": 3
+    },
+    "file_extension": ".py",
+    "mimetype": "text/x-python",
+    "name": "python",
+    "nbconvert_exporter": "python",
+    "pygments_lexer": "ipython3",
+    "version": "3.10.14"
+   },
+   "orig_nbformat": 4
+  },
+  "nbformat": 4,
+  "nbformat_minor": 2
+ }
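One convention in the notebook worth spelling out: the all-zeros tensor stacked under the style embedding is the negative image embedding for classifier-free guidance. With `guidance_scale > 1`, diffusers expects each tensor passed via `ip_adapter_image_embeds` to hold the negative and positive halves along dim 0 and splits them with `.chunk(2)`. A small shape sketch, with the 768-dim size assumed from the ViT-L style head:

```python
# Shape convention for precomputed IP-Adapter embeds under classifier-free
# guidance: dim 0 holds (negative, positive), split by diffusers via .chunk(2).
import torch

style_embeds = torch.randn(1, 768)                # stand-in for the CSD style embedding
negative_embeds = torch.zeros_like(style_embeds)  # "no style" for the unconditional pass
ip_adapter_image_embeds = torch.stack([negative_embeds, style_embeds])
print(ip_adapter_image_embeds.shape)              # torch.Size([2, 1, 768])
```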
setup.py ADDED
@@ -0,0 +1,31 @@
+ from setuptools import find_packages, setup
+
+
+ setup(
+     name="ip_adapter_artist",
+     version="0.1",
+     description="Using reference images to control style in diffusion models",
+     long_description=open("README.md", "r", encoding="utf-8").read(),
+     long_description_content_type="text/markdown",
+     keywords="IP-Adapter, diffusion models, style control, reference images",
+     license="Apache",
+     author="aihao",
+     author_email="aihao2000@outlook.com",
+     url="https://github.com/aihao2000/IP-Adapter-Artist",
+     packages=find_packages(),
+     python_requires=">=3.8.0",
+     install_requires=[
+         "diffusers",
+         "transformers",
+     ],
+     classifiers=[
+         "Development Status :: 5 - Production/Stable",
+         "Intended Audience :: Developers",
+         "Intended Audience :: Education",
+         "Intended Audience :: Science/Research",
+         "License :: OSI Approved :: Apache Software License",
+         "Operating System :: OS Independent",
+         "Programming Language :: Python :: 3",
+         "Topic :: Scientific/Engineering :: Artificial Intelligence",
+     ],
+ )