	---
	license: apache-2.0
	library_name: diffusers
	pipeline_tag: image-to-image
	tags:
	- image-editing
	- multi-image
	- diffusers
	- joyai
	base_model:
	- Qwen/Qwen3-VL-8B-Instruct
	---

	# JoyAI-Image Edit Plus

	JoyAI-Image Edit Plus is a multi-image, instruction-guided editing model from the [JoyAI-Image](https://github.com/jd-opensource/JoyAI-Image) family. Given multiple reference images and a text instruction, it generates a new image that combines elements of the references as the instruction describes.

	## Model Architecture

	| Component | Model | Size |
	|-----------|-------|------|
	| Text Encoder | Qwen3-VL-8B-Instruct | 8B |
	| Transformer (MMDiT) | JoyImageEditPlusTransformer3DModel | 16B |
	| VAE | AutoencoderKLWan | 240M |
	| Scheduler | FlowMatchEulerDiscreteScheduler | - |
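
As a rough worked example of what these sizes imply, the sketch below estimates the memory needed just to hold the weights in `bfloat16` (2 bytes per parameter). It assumes the sizes in the table are nominal parameter counts, and it ignores activations, latents, and framework overhead, so real usage will be higher.

```python
# Nominal parameter counts read from the table above (assumed approximate).
PARAMS = {
    "text_encoder": 8e9,   # Qwen3-VL-8B-Instruct
    "transformer": 16e9,   # JoyImageEditPlusTransformer3DModel
    "vae": 240e6,          # AutoencoderKLWan
}
BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each parameter in 2 bytes


def weight_memory_gb(params, bytes_per_param=BYTES_PER_PARAM_BF16):
    """Return the decimal gigabytes needed to store the weights alone."""
    return sum(params.values()) * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"Approximate weight memory: {weight_memory_gb(PARAMS):.2f} GB")
```

Under these assumptions the weights alone come to roughly 48.5 GB, which is worth keeping in mind when choosing a GPU.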

	## Installation

	`JoyImageEditPlusPipeline` has not yet been merged into an official diffusers release. Until it ships in a stable version, install diffusers from the PR branch:

	```bash
	pip install git+https://github.com/tangyanf/diffusers.git@add-joyimage-edit-plus
	```

	If you have already installed diffusers, make sure to uninstall it first:

	```bash
	pip uninstall diffusers -y
	pip install git+https://github.com/tangyanf/diffusers.git@add-joyimage-edit-plus
	```

	Once the PR is merged into the official diffusers repository, you can switch back to the standard installation:

	```bash
	pip install diffusers --upgrade
	```
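
Whichever install path you use, it can help to confirm that the installed package actually exposes the pipeline class before loading the weights. The helper below is a minimal sketch using only the standard library; `pipeline_available` is a hypothetical name, not part of diffusers.

```python
import importlib


def pipeline_available(module_name="diffusers", attr="JoyImageEditPlusPipeline"):
    """Return True if `module_name` can be imported and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
        # Attribute lookup may itself trigger lazy imports, so guard it too.
        return hasattr(module, attr)
    except Exception:
        return False


if __name__ == "__main__":
    if pipeline_available():
        print("JoyImageEditPlusPipeline is available.")
    else:
        print("JoyImageEditPlusPipeline not found; install diffusers from the PR branch.")
```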

	## Usage

	```python
	import torch
	from PIL import Image
	from diffusers import JoyImageEditPlusPipeline

	pipe = JoyImageEditPlusPipeline.from_pretrained(
	    "jdopensource/JoyAI-Image-Edit-Plus-Diffusers",
	    torch_dtype=torch.bfloat16,
	).to("cuda")

	# Load reference images
	images = [
	    Image.open("reference_0.png").convert("RGB"),
	    Image.open("reference_1.png").convert("RGB"),
	]

	# Determine output resolution from the last reference image
	target_h, target_w = pipe._get_bucket_size(images[-1])

	# Generate
	result = pipe(
	    images=images,
	    prompt="Combine the person from the second image with the scene from the first image.",
	    negative_prompt="low quality, blurry, deformed",
	    height=target_h,
	    width=target_w,
	    num_inference_steps=30,
	    guidance_scale=4.0,
	    generator=torch.Generator(device="cuda").manual_seed(42),
	)
	result.images[0].save("output.png")
	```

	## Example

	Prompt: "The woman is lovingly holding the cute puppy in her arms"

	| Input 0 | Input 1 | Output |
	|---------|---------|--------|
	| ![input_0](examples/input_0.png) | ![input_1](examples/input_1.png) | ![output](examples/output.png) |

	## Recommended Parameters

	| Parameter | Value |
	|-----------|-------|
	| `num_inference_steps` | 30 |
	| `guidance_scale` | 4.0 |
	| `torch_dtype` | `torch.bfloat16` |
	| Resolution | Auto-detected via `_get_bucket_size()` (1024-base buckets) |
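
To give an intuition for what resolution selection around a 1024 base can look like, here is a minimal sketch that keeps the input aspect ratio, targets an area of about 1024 × 1024 pixels, and snaps both sides to a multiple of 16. This is an illustration only: `snap_resolution`, the area target, and the multiple of 16 are assumptions, and the pipeline's own `_get_bucket_size()` may use a fixed bucket table with different values.

```python
import math


def snap_resolution(width, height, base=1024, multiple=16):
    """Return (height, width) near base*base pixels with a similar aspect ratio.

    Hypothetical helper; not the pipeline's actual bucketing logic.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    aspect = width / height
    # Ideal side lengths with area base*base and the input aspect ratio.
    ideal_w = base * math.sqrt(aspect)
    ideal_h = base / math.sqrt(aspect)

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(ideal_h), snap(ideal_w)


if __name__ == "__main__":
    print(snap_resolution(1024, 1024))  # square input
    print(snap_resolution(2048, 1024))  # 2:1 landscape input
```

The return order is `(height, width)` to match how the usage example unpacks `target_h, target_w`.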

	## CLI Inference

	```bash
	python inference.py \
	    --model_path jdopensource/JoyAI-Image-Edit-Plus-Diffusers \
	    --images examples/input_0.png examples/input_1.png \
	    --prompt "The woman is lovingly holding the cute puppy in her arms" \
	    --num_inference_steps 30 \
	    --guidance_scale 4.0 \
	    --seed 42 \
	    --output output.png
	```
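
For readers writing their own wrapper, the flags above could be parsed along the following lines. This is a hypothetical sketch, not the contents of the repository's `inference.py`: the flag names come from the command above, while the defaults (taken from the recommended parameters) and the choice of required arguments are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags shown in the CLI example."""
    parser = argparse.ArgumentParser(description="Multi-image editing (illustrative sketch).")
    parser.add_argument("--model_path", required=True, help="Model repo id or local path.")
    parser.add_argument("--images", nargs="+", required=True, help="One or more reference images.")
    parser.add_argument("--prompt", required=True, help="Editing instruction.")
    parser.add_argument("--num_inference_steps", type=int, default=30)  # assumed default
    parser.add_argument("--guidance_scale", type=float, default=4.0)    # assumed default
    parser.add_argument("--seed", type=int, default=42)                 # assumed default
    parser.add_argument("--output", default="output.png")               # assumed default
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args([
    "--model_path", "jdopensource/JoyAI-Image-Edit-Plus-Diffusers",
    "--images", "examples/input_0.png", "examples/input_1.png",
    "--prompt", "The woman is lovingly holding the cute puppy in her arms",
])
print(args.images, args.num_inference_steps, args.guidance_scale)
```

Using `nargs="+"` lets `--images` take any number of reference paths, which fits the multi-image input the model expects.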

	## Model Details

	- Developed by: JD.com
	- License: Apache-2.0
	- Diffusers version: >= 0.39.0
	- Framework: PyTorch

	## Citation

	```bibtex
	@misc{joyai-image-2025,
	  title={JoyAI-Image: A Unified Multimodal Foundation Model for Image Understanding, Generation, and Editing},
	  author={{Joy Future Academy, JD}},
	  year={2025},
	  url={https://github.com/jd-opensource/JoyAI-Image}
	}
	```