---
base_model:
- Qwen/Qwen-Image-Edit-2511
frameworks:
- Pytorch
license: Apache License 2.0
tags: []
tasks:
- text-to-image-synthesis
---
|
|
# In-Context Editing LoRA (Qwen-Image-Edit-2511) |
|
|
|
|
|
## Model Introduction
|
|
|
|
|
This model adds In-Context Editing capability to [Qwen-Image-Edit-2511](https://modelscope.cn/models/Qwen/Qwen-Image-Edit-2511). You provide three input images: image 1, image 2, and image 3, and the model automatically applies the transformation from image 1 to image 2 onto image 3.
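Conceptually, the three images are passed to the pipeline as an ordered list together with an instruction prompt. The sketch below uses illustrative variable names; the full runnable example is in the Inference Code section below.

```python
# Illustrative input format only; see "Inference Code" below for a runnable example.
edit_image = [image_1, image_2, image_3]  # (before edit, after edit, target to edit)
prompt = "Edit image 3 based on the transformation from image 1 to image 2."
```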
|
|
|
|
|
|
|
|
For more details on the training strategy and implementation, see our [technical blog](https://modelscope.cn/learn/4826).
|
|
|
|
|
## Showcase
|
|
|
|
|
Image sources:
|
|
|
|
|
* Input image 1: https://modelscope.cn/aigc/imageGeneration?tab=advanced&imageId=18968195
|
|
* Input image 2: generated by the editing model via a single-image edit
|
|
* Input image 3: https://modelscope.cn/aigc/imageGeneration?tab=advanced&imageId=18723032
|
|
|
|
|
Prompt: `Edit image 3 based on the transformation from image 1 to image 2.`
|
|
|
|
|
Negative prompt: `泛黄,AI感,不真实,丑陋,油腻的皮肤,异常的肢体,不协调的肢体` (yellowish tint, AI look, unrealistic, ugly, greasy skin, abnormal limbs, uncoordinated limbs)
|
|
|
|
|
* Example 1: Expression reference
|
|
|
|
|
|Input Image 1|Input Image 2|Input Image 3|Output Image|
|-|-|-|-|
|||||
|
|
|
|
|
* Example 2: Style transfer
|
|
|
|
|
|Input Image 1|Input Image 2|Input Image 3|Output Image|
|-|-|-|-|
|||||
|
|
|
|
|
* Example 3: Entity addition
|
|
|
|
|
|Input Image 1|Input Image 2|Input Image 3|Output Image|
|-|-|-|-|
|||||
|
|
|
|
|
* Example 4: Local editing
|
|
|
|
|
|Input Image 1|Input Image 2|Input Image 3|Output Image|
|-|-|-|-|
|||||
|
|
|
|
|
## Inference Code
|
|
|
|
|
Install [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) from source:
|
|
|
|
|
```shell
git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .
```
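DiffSynth-Studio is also published on PyPI as `diffsynth`, but the packaged release may lag behind the repository, so support for recently released models such as Qwen-Image-Edit-2511 is not guaranteed there; the source install above is the safe choice.

```shell
pip install diffsynth
```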
|
|
|
|
|
Inference code:
|
|
|
|
|
```python
import torch
from PIL import Image
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig
from modelscope import snapshot_download

# Load the base model (transformer), text encoder, VAE, and processor
pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="Qwen/Qwen-Image-Edit-2511", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
    ],
    processor_config=ModelConfig(model_id="Qwen/Qwen-Image-Edit", origin_file_pattern="processor/"),
)

# Load the In-Context Editing LoRA into the DiT
lora = ModelConfig(
    model_id="DiffSynth-Studio/Qwen-Image-Edit-2511-ICEdit-LoRA",
    origin_file_pattern="model.safetensors"
)
pipe.load_lora(pipe.dit, lora)

# Download the sample images bundled with this repository
snapshot_download(
    "DiffSynth-Studio/Qwen-Image-Edit-2511-ICEdit-LoRA",
    local_dir="./data",
    allow_file_pattern="assets/*"
)
edit_image = [
    Image.open("data/assets/image1_original.png"),  # image 1: before the edit
    Image.open("data/assets/image1_edit_1.png"),    # image 2: after the edit
    Image.open("data/assets/image2_original.png")   # image 3: target to edit
]
prompt = "Edit image 3 based on the transformation from image 1 to image 2."
negative_prompt = "泛黄,AI感,不真实,丑陋,油腻的皮肤,异常的肢体,不协调的肢体"

# Generate image 4: image 3 with the image-1-to-image-2 transformation applied
image_4 = pipe(
    prompt=prompt, negative_prompt=negative_prompt,
    edit_image=edit_image,
    seed=1,
    num_inference_steps=50,
    height=1280,
    width=720,
    zero_cond_t=True,
)
image_4.save("image.png")
```
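If GPU memory is tight, DiffSynth-Studio pipelines provide VRAM management that offloads idle weights. The one-liner below is a sketch that assumes `QwenImagePipeline` exposes the same `enable_vram_management()` helper as the repository's other pipelines; call it once after the models and LoRA are loaded, before generation.

```python
# Sketch, assuming QwenImagePipeline supports DiffSynth-Studio's
# enable_vram_management() helper, which offloads idle weights to lower peak VRAM.
pipe.enable_vram_management()
```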
|