The result is an all-black image. Here is my inference code — is there anything wrong with it?
import os
from PIL import Image
import torch
from diffusers import QwenImageEditPipeline
pipeline = QwenImageEditPipeline.from_pretrained("ovedrive/qwen-image-edit-4bit")
print("pipeline loaded")
pipeline.to(torch.bfloat16)
pipeline.to("cuda")
pipeline.set_progress_bar_config(disable=None)
image = Image.open("./input/0.png").convert("RGB")
prompt = "Add a small wooden sign in the foreground in front of the penguins with the text 'Welcome to Penguin Beach'."
inputs = {
    "image": image,
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 50,
}

with torch.inference_mode():
    output = pipeline(**inputs)

output_image = output.images[0]
output_image.save("output_image_edit.png")
print("image saved at", os.path.abspath("output_image_edit.png"))
You shouldn't use the VAE from my repo; you need to use the original VAE. Only the transformer and text_encoder are important here. The rest you can take from the original Qwen-Image-Edit repo.
If that is something you can try, let me know how it goes. I am not on my GPU dev machine, so I will post a working example in a day when I am back at my desk.
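A minimal sketch of that component mix, assuming the original weights live at `Qwen/Qwen-Image-Edit` and that your diffusers version exposes `AutoencoderKLQwenImage` (both the repo id and the class name are assumptions — check the model card for the exact names):

```python
# Sketch: use the 4-bit transformer/text_encoder but the original VAE.
# The 4-bit repo's VAE is the suspected cause of the all-black output.

QUANT_REPO = "ovedrive/qwen-image-edit-4bit"  # 4-bit transformer + text_encoder
BASE_REPO = "Qwen/Qwen-Image-Edit"            # assumed original repo id


def component_sources():
    """Which repo each pipeline component should be loaded from."""
    return {
        "transformer": QUANT_REPO,
        "text_encoder": QUANT_REPO,
        "vae": BASE_REPO,  # must come from the original repo
    }


def load_pipeline():
    # Heavy imports kept local so the helpers above stay importable
    # without torch/diffusers installed.
    import torch
    from diffusers import QwenImageEditPipeline, AutoencoderKLQwenImage

    pipe = QwenImageEditPipeline.from_pretrained(
        QUANT_REPO, torch_dtype=torch.bfloat16
    )
    # Swap in the original VAE from the base repo.
    pipe.vae = AutoencoderKLQwenImage.from_pretrained(
        BASE_REPO, subfolder="vae", torch_dtype=torch.bfloat16
    )
    return pipe.to("cuda")


if __name__ == "__main__":
    pipe = load_pipeline()
```

The same idea also works by passing `vae=...` as a component override to `from_pretrained`; swapping the attribute after loading just avoids downloading the full base pipeline.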
By the way, check whether you have VAE tiling enabled; it causes issues with Qwen, so don't use it.
Check the updated sample code in the model card.
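To rule out the tiling issue, a small helper can defensively switch it off before running inference. `disable_tiling()` is the method diffusers autoencoders generally expose, but the `getattr` guard below keeps this a sketch rather than a guarantee that your VAE class has it:

```python
def ensure_tiling_disabled(pipe):
    """Turn off VAE tiling on a diffusers pipeline if it is available.

    Returns True if disable_tiling() was found and called, False otherwise.
    Tiling decodes the latents in patches, which reportedly corrupts
    Qwen-Image-Edit output, so it should stay off for this pipeline.
    """
    disable = getattr(pipe.vae, "disable_tiling", None)
    if callable(disable):
        disable()
        return True
    return False
```

Call it once after building the pipeline, e.g. `ensure_tiling_disabled(pipeline)`, before the first generation.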
Thanks, I'll try it today!