Issue with T2V (not a bug, wrong workflow)
I was testing t2v and hit this issue. I have 3 GPUs and patched ComfyUI's model_management.py to switch to cuda:2 (the 5090) with this code fragment: "torch.cuda.set_device(torch.cuda.device_count()-1); return torch.device(torch.cuda.current_device())". With Qwen I have no issue. Do you override the device in the loader for the Wan model?
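For context, a minimal sketch of what that patch does (the function name is mine, not ComfyUI's; it also falls back to CPU so the sketch runs on machines without CUDA):

```python
import torch

def pick_last_cuda_device():
    """Hedged sketch of the model_management.py patch described above:
    route model loading onto the highest-indexed GPU (cuda:2 on a 3-GPU box)."""
    n = torch.cuda.device_count()
    if n == 0:
        # fallback for illustration only; the original patch assumes GPUs exist
        return torch.device("cpu")
    torch.cuda.set_device(n - 1)  # e.g. 3 GPUs -> set_device(2)
    return torch.device(torch.cuda.current_device())
```

The traceback below suggests this override is not seen by every code path: the activation tensor ends up on cuda:2 while the quantized weight is still on cpu, which is exactly the device mismatch torch._int_mm complains about.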
my config:
Total VRAM 32109 MB, total RAM 156301 MB
pytorch version: 2.9.1+cu130
Set vram state to: NORMAL_VRAM
Device: cuda:2 NVIDIA GeForce RTX 5090 : cudaMallocAsync
Using async weight offloading with 2 streams
Enabled pinned memory 148485.0
working around nvidia conv3d memory bug.
Using sage attention
Python version: 3.13.9 | packaged by Anaconda, Inc. | (main, Oct 21 2025, 19:16:10) [GCC 11.2.0]
ComfyUI version: 0.7.0
ComfyUI frontend version: 1.35.9
torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_function <built-in method _int_mm of type object at 0x748b709dcaa0>((FakeTensor(..., device='cuda:2', size=(s27s77, 5120), dtype=torch.int8), FakeTensor(..., size=(5120, 5120), dtype=torch.int8)), **{}): got RuntimeError('Unhandled FakeTensor Device Propagation for aten._int_mm.default, found two different devices cuda:2, cpu')
from user code:
File "/home/ceasar/miniconda3/lib/python3.13/site-packages/hqqsvd/linear.py", line 137, in _forward
return self.forward_int8(x)
File "/home/ceasar/miniconda3/lib/python3.13/site-packages/hqqsvd/linear.py", line 131, in forward_int8
return (torch._int_mm(x_q, W_q).to(dtype) * scale_x * scale_w).unsqueeze(
I can't remove the topic... I used the wrong EmptyLatent node. With "Empty HunyuanVideo 1.0 Latent" the issue disappears.
All right