[solved] CUDA OOM with RTX 5060 Ti 16G
#5 opened by foxedge
Hi, the model card page mentions this model can be run on 16 GB of VRAM.
However, I tested the demo code with an RTX 5060 Ti 16G and I get a CUDA OOM.
I am using:
- diffusers 0.35.2
- pytorch 2.7.1
- bitsandbytes 0.48.2
- transformers 4.57.1
The output shows:
You are loading your model in 8bit or 4bit but no linear modules were found in your model. Please double check your model architecture, or submit an issue on github if you think this is a bug.
The config attributes {'pooled_projection_dim': 768} were passed to QwenImageTransformer2DModel, but are not expected and will be ignored. Please verify your config.json configuration file.
OutOfMemoryError                          Traceback (most recent call last)
Cell In[1], line 15
     12 torch_dtype = torch.float32
     13 device = "cpu"
---> 15 pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
     16 pipe = pipe.to(device)
     18 positive_magic = {
     19     "en": ", Ultra HD, 4K, cinematic composition.",  # for english prompt
     20     "zh": ", 超清,4K,电影级构图."  # for chinese prompt
     21 }
[...]
Any suggestions to fix this? I am not sure if this is a setup issue or if this model just needs more VRAM. 😅
Thanks!
You can look at this example: https://huggingface.co/ovedrive/qwen-image-edit-4bit/discussions/4#68ae6605af245e5fd682489c
It worked with 16GB. It's just about picking what to offload; 16GB is cutting it close without getting into block swapping for lower VRAM.
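For completeness, here is a minimal sketch of that idea, assuming the checkpoint loads through diffusers' `DiffusionPipeline` and that model-level CPU offload gives enough headroom on 16GB. The model id is a placeholder taken from the repo linked above; swap in whatever checkpoint you are actually loading:

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder id for illustration; use the checkpoint from this model card.
model_name = "ovedrive/qwen-image-edit-4bit"

pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Key change vs. the demo code: do NOT call pipe.to("cuda"), which tries to
# fit every component on the GPU at once. Model-level CPU offload keeps the
# weights in system RAM and moves each sub-model (text encoder, transformer,
# VAE) onto the GPU only for its forward pass.
pipe.enable_model_cpu_offload()

# If 16 GB is still not enough, the more aggressive (and much slower) variant
# offloads at the level of individual layers instead of whole components:
# pipe.enable_sequential_cpu_offload()

# ...then run the rest of the demo's generation code unchanged.
```

Model-level offload usually costs little speed since whole components are swapped at once; sequential offload trades a lot of speed for a much smaller VRAM footprint.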
foxedge changed discussion title from "CUDA OOM with RTX 5060 Ti 16G" to "[solved] CUDA OOM with RTX 5060 Ti 16G"
Fantastic, I am glad it helped.