Update README.md
README.md
CHANGED
@@ -5,6 +5,68 @@ language:
- zh
library_name: diffusers
pipeline_tag: image-to-image
quantized_by: abhishekdujari
base_model:
- Qwen/Qwen-Image-Edit-2509
base_model_relation: quantized
---

This is an NF4-quantized version of Qwen-Image-Edit-2509, so it can run on GPUs with 20 GB of VRAM; it also works on lower-VRAM cards, e.g. 16 GB.
Other NF4 quantizations blindly quantized every layer in the transformer. This one does not: some layers are kept at full precision to preserve output quality (a sketch of the idea is shown below).

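For reference, selective quantization of this kind can be expressed with the `diffusers` bitsandbytes integration. The snippet below is only a minimal sketch of the idea, not the exact recipe used for this checkpoint; in particular, the module names in `llm_int8_skip_modules` are illustrative assumptions, not the actual set kept at full precision here.

```python
# Sketch: NF4-quantize the transformer while keeping selected modules unquantized.
# Requires `bitsandbytes` and `accelerate` to be installed.
import torch
from diffusers import BitsAndBytesConfig, QwenImageTransformer2DModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 4-bit data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
    llm_int8_skip_modules=["proj_out", "norm_out"],  # ILLUSTRATIVE: layers left at full precision
)

transformer = QwenImageTransformer2DModel.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509",
    subfolder="transformer",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
# Serialize the quantized weights (supported by recent diffusers releases).
transformer.save_pretrained("./qwen-image-edit-2509-transformer-nf4")
```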
You can use the original Qwen-Image-Edit parameters.

This model is `not yet` available for inference at JustLab.ai.

Model tested: works well even with 10 steps.
Contact: [JustLab.ai](https://justlab.ai) for commercial support.

### Performance on an RTX 4090
- 20 steps: about 78 seconds.
- 10 steps: about 40 seconds.

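These are rough wall-clock numbers. If you want to reproduce them, a small helper like the sketch below (hypothetical, not part of this repo) can wrap the pipeline call from the sample script further down; the explicit CUDA synchronization keeps asynchronous kernel launches from skewing the measurement.

```python
# Minimal timing helper (sketch); wrap it around the pipeline call from the sample script below.
import time
import torch

def time_call(fn):
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # finish pending GPU work before starting the clock
    start = time.perf_counter()
    result = fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for the GPU to finish before stopping the clock
    return result, time.perf_counter() - start

# Example, assuming `pipeline` and `inputs` from the sample script below:
# output, seconds = time_call(lambda: pipeline(**inputs))
# print(f"{seconds:.1f}s for {inputs['num_inference_steps']} steps")
```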
Interestingly, I was under the impression that the Qwen-VL text encoder could not be quantized, which is why several projects use the full ~15 GB model. Here it has been quantized as well, and it seems to be working fine.

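If you prefer to build your own 4-bit text encoder instead of using the one bundled here, the sketch below shows one way to do it with the `transformers` bitsandbytes integration and hand the result to the pipeline. It assumes the text encoder is a Qwen2.5-VL model stored in the `text_encoder` subfolder of the base repository, as in the upstream layout; treat it as a sketch rather than the exact procedure used for this checkpoint.

```python
# Sketch: load the Qwen2.5-VL text encoder in 4-bit NF4 and override the pipeline component.
import torch
from transformers import BitsAndBytesConfig, Qwen2_5_VLForConditionalGeneration
from diffusers import QwenImageEditPlusPipeline

te_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

text_encoder = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509",   # base repo; assumes a "text_encoder" subfolder
    subfolder="text_encoder",
    quantization_config=te_config,
    torch_dtype=torch.bfloat16,
)

pipeline = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509",
    text_encoder=text_encoder,     # use the freshly quantized encoder instead of the bf16 one
    torch_dtype=torch.bfloat16,
)
```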
Sample script (minimum 20 GB VRAM):

```python
import os
from PIL import Image
import torch

from diffusers import QwenImageEditPlusPipeline

model_path = "ovedrive/Qwen-Image-Edit-2509-4bit"
pipeline = QwenImageEditPlusPipeline.from_pretrained(model_path, torch_dtype=torch.bfloat16)
print("pipeline loaded")  # loaded to CPU at this point; do not move the whole pipeline to CUDA here

pipeline.set_progress_bar_config(disable=None)
pipeline.enable_model_cpu_offload()  # if you have enough VRAM (about 20 GB), replace this line with `pipeline.to("cuda")`
image = Image.open("./example.png").convert("RGB")
prompt = "Remove the lady head with white hair"
inputs = {
    "image": image,
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 20,  # even 10 steps is enough in many cases
}

with torch.inference_mode():
    output = pipeline(**inputs)

output_image = output.images[0]
output_image.save("output_image_edit.png")
print("image saved at", os.path.abspath("output_image_edit.png"))
```
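Qwen-Image-Edit-2509 also advertises multi-image editing, where `image` is a list of inputs instead of a single picture. The variation below continues from the script above (it reuses `pipeline`); the file names and prompt are placeholders, and I have not benchmarked this path on the quantized checkpoint, so treat it as a sketch.

```python
# Variation (continuing from the script above): pass a list of images for multi-image editing.
image1 = Image.open("./person.png").convert("RGB")      # placeholder file name
image2 = Image.open("./background.png").convert("RGB")  # placeholder file name

inputs = {
    "image": [image1, image2],
    "prompt": "Place the person from the first image into the scene from the second image",
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 20,
}

with torch.inference_mode():
    pipeline(**inputs).images[0].save("output_multi_image_edit.png")
```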

The original Qwen-Image-Edit-2509 attributions are included verbatim below.

---
<p align="center">
    <img src="https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/qwen_image_edit_logo.png" width="400"/>