I tried it on an RTX 4090 with 24 GB of VRAM, but it failed with an out-of-memory (OOM) error. After I switched to a hybrid GPU-CPU execution mode, the final output was incorrect.
On an L20 with CUDA 12.2, the model runs, but the demo code produces incorrect results.