How to use Qwen/Qwen-Image with Diffusers:
pip install -U diffusers transformers accelerate

import torch
from diffusers import DiffusionPipeline

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image", dtype=torch.bfloat16, device_map="cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
So how do I input multiple images? (Plain Python code only, without ComfyUI.)
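import torch
from PIL import Image

# Load both reference images and resize them to the same resolution.
image1 = Image.open("./1.png").convert("RGB").resize((512, 512))
image2 = Image.open("./2.png").convert("RGB").resize((512, 512))

# `prompt` is defined elsewhere in the script.
inputs = {
    "image": [image1, image2],  # a list of two images
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": "",
    "num_inference_steps": 8,
}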
I tried to pass multiple images as a list into the pipeline, but unfortunately, it does not work. The error is as follows:
"/home/zyy/miniconda3/envs/py311/lib/python3.11/site-packages/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py", line 1220, in get_placeholder_mask
    raise ValueError(
ValueError: Image features and image tokens do not match: tokens: 324, features: 648
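The numbers in the error are worth a quick check: 648 is exactly 2 × 324, which suggests that image features were extracted for both images while placeholder tokens were inserted for only one. The sketch below reproduces the 324 figure for a 512×512 input. It assumes the Qwen2.5-VL vision encoder uses 14-pixel patches merged 2×2 (one token per 28×28 pixel cell) and rounds each side to the nearest multiple of 28. The function name and its defaults are illustrative only, not part of transformers or diffusers.

```python
def merged_token_count(height, width, patch_size=14, merge_size=2):
    """Estimate how many image tokens one image produces.

    Assumption: one token per (patch_size * merge_size)-pixel square cell,
    with each side rounded to the nearest whole number of cells.
    """
    cell = patch_size * merge_size          # 28 pixels with the assumed defaults
    rows = max(1, round(height / cell))
    cols = max(1, round(width / cell))
    return rows * cols


per_image = merged_token_count(512, 512)
print("tokens for one 512x512 image:", per_image)
print("features for two such images:", 2 * per_image)
```

If this assumption holds, the mismatch comes from how the list of images is turned into placeholder tokens, not from the images themselves.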
import torch
from PIL import Image

image_paths = ["./1.png", "./2.png"]
prompt = "Wushu stance"

# Build a separate set of pipeline inputs for each image instead of passing a list.
for image_path in image_paths:
    image = Image.open(image_path).convert("RGB").resize((512, 512))
    inputs = {
        "image": image,
        "prompt": prompt,
        "generator": torch.manual_seed(0),
        "true_cfg_scale": 4.0,
        "negative_prompt": "",
        "num_inference_steps": 8,
    }
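The loop above handles each image on its own, so a single edit never sees both references at once. One possible workaround, not confirmed anywhere in this thread, is to stitch the images side by side into one canvas and pass that as the single "image" input. The helper below is a minimal sketch using Pillow; `stitch_side_by_side` and its parameters are hypothetical names, and whether the model produces good results from a stitched input is an open question.

```python
from PIL import Image


def stitch_side_by_side(images, height=512, background=(255, 255, 255)):
    """Paste images left to right on one canvas, each scaled to a common height."""
    scaled = []
    for img in images:
        img = img.convert("RGB")
        # Keep the aspect ratio while matching the target height.
        width = max(1, round(img.width * height / img.height))
        scaled.append(img.resize((width, height)))

    canvas = Image.new("RGB", (sum(i.width for i in scaled), height), background)
    x = 0
    for img in scaled:
        canvas.paste(img, (x, 0))
        x += img.width
    return canvas
```

With the variables from the earlier snippet, this would be used as `inputs["image"] = stitch_side_by_side([image1, image2])`, and the prompt would then need to refer to the left and right halves explicitly.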