Error when running the code from the Model card
It fails with:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 54.00 MiB. GPU 0 has a total capacty of 21.49 GiB of which 50.38 MiB is free. Including non-PyTorch memory, this process has 21.44 GiB memory in use. Of the allocated memory 21.14 GiB is allocated by PyTorch, and 152.80 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Environment: 6 GPUs with 22 GB of VRAM each. To make use of multiple cards, I added device_map="balanced" to the original code:
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/glm-4v-9b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map="balanced",
    trust_remote_code=True,
).to(device).eval()
Could someone help figure out what is causing this?
Please use a single GPU for now; multi-GPU loading still seems to have a problem. We will fix it.
This has been fixed.
OK, I'll give it a try when I get a chance!
Set device_map to "auto". The basic demo in the GitHub repo shows how to load the model across multiple GPUs.
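Following that suggestion, here is a minimal sketch of multi-GPU loading. Note that it also drops the trailing .to(device) from the original snippet: when device_map is set, accelerate already places each layer on a GPU, and a subsequent .to(device) moves all the shards onto a single card, which is one plausible reading of the OOM above (my interpretation, not confirmed in this thread).

```python
import torch
from transformers import AutoModelForCausalLM

# Sketch only: shard the model across all visible GPUs with
# device_map="auto" and do NOT call .to(device) afterwards --
# that would collapse every shard onto one GPU.
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/glm-4v-9b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    device_map="auto",       # let accelerate balance layers across GPUs
    trust_remote_code=True,
).eval()

# hf_device_map reports which device each submodule landed on,
# which is a quick way to verify the model really was sharded.
print(model.hf_device_map)
```

This requires accelerate to be installed; with 6 x 22 GB cards the bfloat16 weights of a 9B model should fit comfortably once they are actually spread out.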
