Problem with multi-GPU

#22

by VityaVitalich - opened May 4, 2024

Dear maintainer,

I encounter a problem when setting device_map='auto'. An error about tensors being on different devices always arises at some point in the forward pass. I tried the solution from this discussion (https://discuss.huggingface.co/t/runtimeerror-expected-all-tensors-to-be-on-the-same-device-but-found-at-least-two-devices-cuda-1-and-cuda-0/39548/13), but it did not help and the problem persists. In my case it fails when LayerNorm is applied, even though both the LayerNorm parameters and the inputs are on the same device, which is strange.
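To help narrow this down, here is a minimal sketch of how the placement chosen by device_map='auto' could be inspected. It assumes the model was loaded through transformers with a device map, in which case the module-to-device mapping is available as `model.hf_device_map`. The helper names and the module names in the example map are hypothetical, for illustration only, and are not taken from this model.

```python
from collections import defaultdict


def group_by_device(device_map):
    """Group module names by the device they were assigned to."""
    groups = defaultdict(list)
    for module_name, device in device_map.items():
        # Devices may be GPU indices (int) or strings such as "cpu".
        groups[str(device)].append(module_name)
    return dict(groups)


def device_boundaries(device_map):
    """Return consecutive (module, next_module) pairs placed on different devices."""
    items = list(device_map.items())
    return [
        (name_a, name_b)
        for (name_a, dev_a), (name_b, dev_b) in zip(items, items[1:])
        if dev_a != dev_b
    ]


# Hypothetical map for illustration; with a real model, pass `model.hf_device_map`.
example_map = {
    "model.embed_tokens": 0,
    "model.layers.0": 0,
    "model.layers.1": 1,
    "model.norm": 1,
    "lm_head": 1,
}

print(group_by_device(example_map))
print(device_boundaries(example_map))
```

The boundaries are the places where activations have to move from one GPU to another, so they are a reasonable first place to compare against the module named in the traceback.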

Please consider fixing this bug; otherwise, running inference or fine-tuning on a large amount of data remains quite a problem.
