TowerInstruct takes twice as much space as TowerBase

by bpop - opened Jan 22, 2024

Hello all,

I've been experimenting with both TowerBase and TowerInstruct. When I load them in Python, they both have the expected number of parameters (6.7 billion, give or take). However, they take up drastically different amounts of space in my .cache: TowerInstruct is 26G, while TowerBase is only 13G. I imagine this is because TowerBase's weights are stored as bf16, whereas TowerInstruct's are stored as fp32. But was this on purpose?
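
For what it's worth, a quick back-of-the-envelope check seems to fit the dtype explanation. This is only a rough sketch: it assumes the approximate count of 6.7 billion parameters mentioned above, 2 bytes per bf16 weight and 4 bytes per fp32 weight, and it ignores any file metadata. The helper name `estimated_size_gb` is made up for illustration.

```python
# Rough checkpoint-size estimate: parameter count times bytes per parameter.
# 6.7e9 is the approximate count quoted in the post, not an exact figure.
BYTES_PER_PARAM = {"bf16": 2, "fp16": 2, "fp32": 4}


def estimated_size_gb(num_params: float, dtype: str) -> float:
    """Approximate weight size in GB (10**9 bytes), ignoring metadata overhead."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


num_params = 6.7e9
sizes = {dtype: estimated_size_gb(num_params, dtype) for dtype in ("bf16", "fp32")}

for dtype, size in sizes.items():
    print(f"{dtype}: ~{size:.1f} GB")
```

The fp32 estimate comes out at exactly twice the bf16 one, which lines up with the roughly 13G versus 26G I see on disk, give or take rounding and unit differences.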
