Using model on H100

#38
by Jeosss - opened

Hello,
I am trying to use the model on Google Colab, where the biggest machine available to me is an H100 (230 GB RAM, 80 GB GPU memory).
The model can barely fit and run in RAM alone, but it's too slow, so I tried moving it to the GPU with `.to("cuda")`, and then I get a CUDA out-of-memory error.

I am not an expert on this, but is there a way to run the model partly in RAM and partly on the GPU, so it's faster than running on RAM alone?
Has anyone managed to do this on Colab or on an H100?
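One common way to do this is Hugging Face Accelerate's `device_map="auto"`, which `transformers` exposes through `from_pretrained`: it fills the GPU up to a memory cap and offloads the remaining layers to CPU RAM. A minimal sketch, assuming `transformers` and `accelerate` are installed (the model id and the memory caps below are placeholders, not values from this thread):

```python
# Sketch: split a large model between GPU VRAM and CPU RAM using
# Accelerate's device_map="auto" via transformers.from_pretrained.
# Assumes: pip install transformers accelerate
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your/model-id"  # placeholder: replace with the actual repo id

# Cap GPU 0 below the full 80 GiB to leave headroom for activations and
# the KV cache; layers that don't fit are placed in CPU RAM automatically.
max_memory = {0: "70GiB", "cpu": "200GiB"}


def load_split_model(model_id: str = MODEL_ID):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision roughly halves memory vs. float32
        device_map="auto",          # Accelerate dispatches layers across devices
        max_memory=max_memory,
    )
    return tokenizer, model


if __name__ == "__main__":
    tok, model = load_split_model()
    # hf_device_map shows which modules landed on the GPU vs. "cpu"
    print(model.hf_device_map)
```

Inference on the CPU-resident layers is still slow, but keeping as many layers as possible on the GPU is usually much faster than CPU-only. Quantized loading (e.g. `load_in_8bit=True` via bitsandbytes) can also shrink the model enough to fit entirely in 80 GB, depending on the model size.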
