How to run multi-GPU inference on 2x NVIDIA T4?

#28

by Marshalldom - opened May 7

I am trying to split a model across two T4 GPUs but am running into issues. What is the best way to implement memory mapping, or to use Accelerate or vLLM, for this setup?

kencwt

Motif Technologies org May 15

Please try the distributed inference guide linked below and let us know if it works for you. If you have already tried it, please share the error logs and we would be glad to look into them.
https://huggingface.co/docs/diffusers/main/en/training/distributed_inference
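
For reference, here is a minimal sketch of the device-placement approach that guide describes, assuming the model loads through `DiffusionPipeline`. The thread does not say which checkpoint or error is involved, so `model_id`, the helper names `build_max_memory` and `load_pipeline`, and the memory figures (about 15 GiB usable per T4, 2 GiB of headroom, 24 GiB of CPU RAM) are all illustrative assumptions, not values confirmed by the maintainers. It uses `float16` because the T4 has no native `bfloat16` support.

```python
def build_max_memory(num_gpus=2, gpu_total_gib=15, headroom_gib=2, cpu_gib=24):
    """Build an Accelerate-style max_memory dict with one entry per GPU plus CPU.

    All defaults are assumptions for a 2x T4 machine; adjust them to match
    what `nvidia-smi` reports on your system.
    """
    per_gpu = gpu_total_gib - headroom_gib  # leave room for activations
    budget = {gpu: f"{per_gpu}GiB" for gpu in range(num_gpus)}
    budget["cpu"] = f"{cpu_gib}GiB"  # optional spill-over to CPU RAM
    return budget


def load_pipeline(model_id):
    """Load a pipeline with its components spread across the visible GPUs.

    `model_id` is a placeholder: pass the repository name you are using.
    Imports are local so the helper above can be used without a GPU.
    """
    import torch
    from diffusers import DiffusionPipeline

    return DiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,      # T4 lacks native bfloat16
        device_map="balanced",          # distribute components across GPUs
        max_memory=build_max_memory(),  # per-device memory caps
    )


if __name__ == "__main__":
    # Printing the budget needs no GPU; loading the model does.
    print(build_max_memory())
```

If this still runs out of memory, the full traceback and the output of `nvidia-smi` would be the most useful logs to post here.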
