How much disk space does each of the BLOOM models require?

#50

by dgaff - opened Jul 18, 2022

Hey all - I'm looking for a listing of the new models by disk usage and can't seem to find one. Is it published anywhere?

BigScience Workshop org Jul 18, 2022


A good rule of thumb for autoregressive transformers is roughly 13x the number of parameters, in bytes, for training and 2x the number of parameters, in bytes, for inference.
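As a rough sketch of that rule of thumb in Python: this assumes the multipliers are bytes per parameter (2 bytes matches 16-bit weights), and the function name, the constants, and the 1-billion-parameter example are hypothetical illustrations rather than figures for any specific BLOOM model.

```python
# Rule-of-thumb multipliers from the reply above.
# Assumption: they are interpreted as bytes per parameter.
TRAINING_BYTES_PER_PARAM = 13
INFERENCE_BYTES_PER_PARAM = 2

BYTES_PER_GB = 1e9  # decimal gigabytes


def estimate_gb(num_params: int, mode: str = "inference") -> float:
    """Estimate the footprint in GB for a model with `num_params` parameters."""
    if mode == "training":
        bytes_per_param = TRAINING_BYTES_PER_PARAM
    elif mode == "inference":
        bytes_per_param = INFERENCE_BYTES_PER_PARAM
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return num_params * bytes_per_param / BYTES_PER_GB


if __name__ == "__main__":
    # Hypothetical 1-billion-parameter model, used only for illustration.
    params = 1_000_000_000
    print(f"inference: ~{estimate_gb(params, 'inference'):.0f} GB")
    print(f"training:  ~{estimate_gb(params, 'training'):.0f} GB")
```

This is only an estimate. The actual size of the files on disk depends on the precision the weights are saved in and on whether optimizer states are stored with the checkpoint.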


It needed around 400GB just to fit all the weight files. They list the sizes of the weights and checkpoints under the Training section.

BigScience Workshop org Nov 15, 2022

Closing as this seems resolved.

TimeRobber changed discussion status to closed Nov 15, 2022
