Unable to load Bloom on an EC2 instance

#99

by viniciusguimaraes - opened Aug 31, 2022

Aug 31, 2022

Hi everyone. I am trying to load Bloom-175B on an x2iezn.6xlarge (specs below), but it gets stuck in the BloomForCausalLM.from_pretrained() call. Using faulthandler's dump_traceback_later method, I was able to narrow down the exact line where the code stops (attached image), but I'm still trying to understand why it happens. The line in PyTorch where it seems to hang is

storage = zip_file.get_storage_from_record(name, numel, torch._UntypedStorage).storage()._untyped()

Has anyone had a similar problem and been able to solve it?

x2iezn.6xlarge specs:
768 GB RAM
24 vCPUs
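For anyone who wants to reproduce the diagnosis described above, here is a minimal sketch of the faulthandler approach. The helper name call_with_watchdog and its parameters are hypothetical (they are not from the original post); only faulthandler.dump_traceback_later and faulthandler.cancel_dump_traceback_later are standard-library calls.

```python
import faulthandler
import sys


def call_with_watchdog(fn, timeout_s=600.0, out=sys.stderr):
    """Run fn() and dump all thread tracebacks every timeout_s seconds until it finishes.

    Hypothetical helper for illustration. `out` must be a real file
    object with a file descriptor (sys.stderr or an opened file).
    """
    # repeat=True re-arms the timer, so a long hang produces a dump
    # every timeout_s seconds rather than only once.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=out)
    try:
        return fn()
    finally:
        # Disarm the timer whether fn returned or raised.
        faulthandler.cancel_dump_traceback_later()


# Usage (assumption: the slow call is the one from this thread):
# model = call_with_watchdog(
#     lambda: BloomForCausalLM.from_pretrained("bigscience/bloom"),
#     timeout_s=300,
# )
```

If the same stack frame shows up in several consecutive dumps, that is probably where the load is blocked or just very slow.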

arteagac

Aug 31, 2022

edited Aug 31, 2022

Hi @viniciusguimaraes. Alternatively, you could try downloading the model first and then loading it from the downloaded folder, as follows:

Download model:

git lfs install
git clone https://huggingface.co/bigscience/bloom

Use model:

from transformers import AutoModel
model = AutoModel.from_pretrained("<your_downloaded_folder>/bloom")
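Before loading from the folder, it may be worth checking that the clone really pulled the weight files. This is a sketch under one assumption: if git lfs was not active during the clone, the large files are left as small text pointers that start with the Git LFS spec line. The function name find_lfs_pointers is hypothetical.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(folder):
    """Return files under folder that are still Git LFS pointer stubs."""
    root = Path(folder)
    stubs = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the repository's .git folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as fh:
            if fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                stubs.append(path)
    return stubs
```

An empty result from find_lfs_pointers("<your_downloaded_folder>/bloom") suggests the real files are on disk; any paths it returns would need to be fetched with git lfs pull before from_pretrained can read them.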
