---
license: apache-2.0
library_name: diffusers
pipeline_tag: image-to-image
tags:
- image-editing
- multi-image
- diffusers
- joyai
base_model:
- Qwen/Qwen3-VL-8B-Instruct
---
# JoyAI-Image Edit Plus
JoyAI-Image Edit Plus is a multi-image, instruction-guided editing model from the [JoyAI-Image](https://github.com/jd-opensource/JoyAI-Image) family. It takes **multiple reference images** and a text instruction, and generates a new image that combines elements from the references as the instruction describes.
## Model Architecture
| Component | Model | Size |
|-----------|-------|------|
| Text Encoder | Qwen3-VL-8B-Instruct | 8B |
| Transformer (MMDiT) | JoyImageEditPlusTransformer3DModel | 16B |
| VAE | AutoencoderKLWan | 240M |
| Scheduler | FlowMatchEulerDiscreteScheduler | - |
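As a rough guide to hardware requirements, the sizes in the table can be turned into a weights-only memory estimate. The sketch below assumes the listed sizes are parameter counts and that every component is loaded in `torch.bfloat16` (2 bytes per parameter). It ignores activations and framework overhead, so actual usage will be higher.

```python
# Weights-only memory estimate from the component table above.
# Assumption: "Size" is a parameter count and all weights are stored in bfloat16.
BYTES_PER_PARAM_BF16 = 2

component_params = {
    "text_encoder": 8e9,   # Qwen3-VL-8B-Instruct
    "transformer": 16e9,   # JoyImageEditPlusTransformer3DModel
    "vae": 240e6,          # AutoencoderKLWan
}


def weights_gb(params: float, bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Return the size of `params` parameters in decimal gigabytes."""
    return params * bytes_per_param / 1e9


per_component_gb = {name: weights_gb(n) for name, n in component_params.items()}
total_gb = sum(per_component_gb.values())

for name, gb in per_component_gb.items():
    print(f"{name:>12}: {gb:6.2f} GB")
print(f"{'total':>12}: {total_gb:6.2f} GB")
```

Under these assumptions the weights alone come to roughly 48 GB.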
## Installation
`JoyImageEditPlusPipeline` has not yet been merged into an official diffusers release. Until it ships in a stable version, install diffusers from the PR branch:
```bash
pip install git+https://github.com/tangyanf/diffusers.git@add-joyimage-edit-plus
```
If diffusers is already installed, uninstall it first:
```bash
pip uninstall diffusers -y
pip install git+https://github.com/tangyanf/diffusers.git@add-joyimage-edit-plus
```
Once the PR is merged into the official diffusers repository, you can switch back to the standard installation:
```bash
pip install diffusers --upgrade
```
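To confirm which build is active before running the examples, it can help to test whether the installed package exposes the pipeline class. The following is a minimal sketch using only the standard library. The helper name `has_joyimage_pipeline` is hypothetical and not part of diffusers.

```python
import importlib


def has_joyimage_pipeline(class_name: str = "JoyImageEditPlusPipeline") -> bool:
    """Return True if the installed diffusers package exposes `class_name`."""
    try:
        diffusers = importlib.import_module("diffusers")
        return hasattr(diffusers, class_name)
    except Exception:
        # diffusers is missing, or its lazy import failed (e.g. a broken dependency)
        return False


if __name__ == "__main__":
    if has_joyimage_pipeline():
        print("JoyImageEditPlusPipeline is available.")
    else:
        print("JoyImageEditPlusPipeline not found; install diffusers from the PR branch.")
```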
## Usage
```python
import torch
from PIL import Image
from diffusers import JoyImageEditPlusPipeline

pipe = JoyImageEditPlusPipeline.from_pretrained(
    "jdopensource/JoyAI-Image-Edit-Plus-Diffusers",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Load reference images
images = [
    Image.open("reference_0.png").convert("RGB"),
    Image.open("reference_1.png").convert("RGB"),
]

# Determine output resolution from the last reference image
target_h, target_w = pipe._get_bucket_size(images[-1])

# Generate
result = pipe(
    images=images,
    prompt="Combine the person from the second image with the scene from the first image.",
    negative_prompt="low quality, blurry, deformed",
    height=target_h,
    width=target_w,
    num_inference_steps=30,
    guidance_scale=4.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
)
result.images[0].save("output.png")
```
## Example
**Prompt:** "The woman is lovingly holding the cute puppy in her arms"
| Input 0 | Input 1 | Output |
|---------|---------|--------|
|  |  |  |
## Recommended Parameters
| Parameter | Value |
|-----------|-------|
| `num_inference_steps` | 30 |
| `guidance_scale` | 4.0 |
| `torch_dtype` | `torch.bfloat16` |
| Resolution | Auto-detected via `_get_bucket_size()` (1024-base buckets) |
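This card does not document how `_get_bucket_size()` chooses a resolution. To illustrate what aspect-ratio bucketing around a 1024 base generally looks like, here is a self-contained sketch. The bucket list, the `nearest_bucket` helper, and the log-ratio matching rule are all assumptions made for illustration, not the pipeline's actual implementation.

```python
import math

# Hypothetical (height, width) buckets with roughly 1024 * 1024 pixels each.
# The real list used by the pipeline may differ.
BUCKETS = [
    (1024, 1024),
    (896, 1152), (1152, 896),
    (832, 1216), (1216, 832),
    (768, 1344), (1344, 768),
]


def nearest_bucket(height: int, width: int) -> tuple[int, int]:
    """Pick the bucket whose aspect ratio is closest to the input's (in log space)."""
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    target = math.log(height / width)
    return min(BUCKETS, key=lambda hw: abs(math.log(hw[0] / hw[1]) - target))


# A landscape photo with height=1080 and width=1920 maps to the widest bucket.
print(nearest_bucket(1080, 1920))
```

Comparing ratios in log space treats a 2:1 and a 1:2 mismatch symmetrically, which is a common choice for this kind of matching.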
## CLI Inference
```bash
python inference.py \
    --model_path jdopensource/JoyAI-Image-Edit-Plus-Diffusers \
    --images examples/input_0.png examples/input_1.png \
    --prompt "The woman is lovingly holding the cute puppy in her arms" \
    --num_inference_steps 30 \
    --guidance_scale 4.0 \
    --seed 42 \
    --output output.png
```
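The source of `inference.py` is not reproduced in this card. As a reading aid for the flags above, the sketch below shows one way such a command line could be parsed with `argparse`. The `build_parser` function and the default values (borrowed from the recommended parameters and the example command) are assumptions, and the real script may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags used in the CLI example (hypothetical)."""
    parser = argparse.ArgumentParser(description="Multi-image editing inference (sketch)")
    parser.add_argument("--model_path", required=True, help="Hub ID or local path")
    parser.add_argument("--images", nargs="+", required=True, help="Reference image paths")
    parser.add_argument("--prompt", required=True, help="Editing instruction")
    parser.add_argument("--num_inference_steps", type=int, default=30)
    parser.add_argument("--guidance_scale", type=float, default=4.0)
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--output", default="output.png")
    return parser


# Parse the same required arguments as the shell command above.
args = build_parser().parse_args([
    "--model_path", "jdopensource/JoyAI-Image-Edit-Plus-Diffusers",
    "--images", "examples/input_0.png", "examples/input_1.png",
    "--prompt", "The woman is lovingly holding the cute puppy in her arms",
])
print(args.images, args.num_inference_steps, args.guidance_scale)
```

Using `nargs="+"` lets `--images` take one or more paths, matching the multi-image input shown in the command.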
## Model Details
- **Developed by**: JD.com
- **License**: Apache-2.0
- **Diffusers version**: >= 0.39.0
- **Framework**: PyTorch
## Citation
```bibtex
@misc{joyai-image-2025,
  title={JoyAI-Image: A Unified Multimodal Foundation Model for Image Understanding, Generation, and Editing},
  author={Joy Future Academy, JD},
  year={2025},
  url={https://github.com/jd-opensource/JoyAI-Image}
}
```