How to run multi-GPU inference on 2x NVIDIA T4?

#28
by Marshalldom - opened

I am trying to split a model across two T4 GPUs but keep running into issues. What is the best way to implement memory mapping, or to use Accelerate or vLLM, for this setup?

Motif Technologies org

Please try the distributed inference guide linked below and let us know if it works for you. If you have already tried it, please share the error logs and we would be glad to look into them.
https://huggingface.co/docs/diffusers/main/en/training/distributed_inference
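
For reference, here is a minimal sketch of one way to shard a pipeline across two T4s with diffusers and Accelerate. It is not a confirmed recipe for this model. It assumes the checkpoint loads through `DiffusionPipeline`, that each T4 exposes roughly 15 GiB, and that leaving about 2 GiB of headroom per card for activations is enough. `build_max_memory`, `load_sharded_pipeline`, and the `model_id` argument are hypothetical names used only for illustration. The T4 has no native bfloat16 support, so the sketch loads weights in float16.

```python
def build_max_memory(num_gpus: int = 2, per_gpu_gib: int = 15, headroom_gib: int = 2) -> dict:
    """Build a per-GPU memory cap in the {device_index: "NGiB"} form Accelerate accepts.

    The defaults (15 GiB per T4, 2 GiB headroom) are assumptions, not measured values.
    """
    if headroom_gib >= per_gpu_gib:
        raise ValueError("headroom must be smaller than the per-GPU capacity")
    usable = per_gpu_gib - headroom_gib
    return {gpu: f"{usable}GiB" for gpu in range(num_gpus)}


def load_sharded_pipeline(model_id: str):
    """Load a pipeline with its components spread over the visible GPUs.

    `model_id` is a placeholder; pass the repository you are actually using.
    """
    # Imported lazily so the helper above can be used without a GPU stack installed.
    import torch
    from diffusers import DiffusionPipeline

    return DiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,      # T4 (Turing) lacks native bfloat16
        device_map="balanced",          # place whole components on different GPUs
        max_memory=build_max_memory(),  # cap each GPU below its physical capacity
    )


if __name__ == "__main__":
    print(build_max_memory())  # {0: '13GiB', 1: '13GiB'}
```

Note that `device_map="balanced"` places whole pipeline components (for example the text encoder and the denoiser) on different cards. If a single component is larger than one T4, this alone will not help, and the error logs would show where loading fails.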
