Mismatch in lm_head.weight and model.embed_tokens.weight Layer Sizes
#2
by Danivilanova - opened
I am experiencing an issue with the sizes of the lm_head.weight and model.embed_tokens.weight layers in the model. According to the vocab_size in the config, both layers should have shape torch.Size([32001, 8192]). However, when I load the checkpoint, both layers have shape torch.Size([32000, 8192]).
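For reference, this kind of off-by-one usually means the config counts an added special token (e.g. a pad token) that the saved weights do not include. A minimal sketch of how the embedding matrix can be padded to match the config, mimicking what transformers' `resize_token_embeddings` effectively does (the sizes here are illustrative; the hidden size is 8192 in the actual model, shrunk below to keep the example light):

```python
import torch
import torch.nn as nn

ckpt_vocab, hidden = 32000, 64  # hidden is 8192 in the actual model
config_vocab = 32001            # vocab_size reported by the config

# Stand-in for the embedding as stored in the checkpoint.
old_embed = nn.Embedding(ckpt_vocab, hidden)

# Resize: keep the existing rows, initialize the extra row
# (here with the mean of the existing rows, a common choice).
new_embed = nn.Embedding(config_vocab, hidden)
with torch.no_grad():
    new_embed.weight[:ckpt_vocab] = old_embed.weight
    new_embed.weight[ckpt_vocab:] = old_embed.weight.mean(dim=0)

print(new_embed.weight.shape)  # torch.Size([32001, 64])
```

With transformers, the equivalent one-liner after loading would be `model.resize_token_embeddings(32001)`, which resizes both `model.embed_tokens` and `lm_head` when they are tied; whether that is the right fix here depends on whether the checkpoint was actually trained with the extra token.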
Has anyone else encountered this issue, or does anyone have any insights or suggestions on how to resolve it? Any help would be greatly appreciated!