the vocab is mismatched with the original phi-2
#2
by Spico - opened
Thanks for sharing the model~
It seems the vocab size (50296) is smaller than the original phi-2's (51200). Were any special operations performed to drop tokens from the vocab?
lxuechen changed discussion status to closed
Hi, thanks for the message. Yeah, the embedding size changed when I added the padding token. The original embedding size had been padded up to a multiple of 64.
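For anyone hitting the same mismatch: in `transformers` this typically happens via `resize_token_embeddings` after adding a pad token, which shrinks (or grows) the embedding matrix to the tokenizer length. A minimal sketch of the arithmetic with a toy embedding (the token counts come from this thread; the exact tokenizer length of 50295 is an assumption, and the tiny embedding dim is just for illustration):

```python
import torch
import torch.nn as nn

# Original phi-2 rounds its vocab up to a multiple of 64 (51200 = 800 * 64),
# so the embedding has more rows than the tokenizer has tokens.
orig_vocab = 51200
tokenizer_len = 50295       # hypothetical raw tokenizer length
new_vocab = tokenizer_len + 1  # + 1 pad token -> 50296

# Toy embedding standing in for the real token-embedding matrix.
emb = nn.Embedding(orig_vocab, 8)

# Resizing to the tokenizer length keeps the first new_vocab rows and
# drops the unused padding rows at the end (what resize_token_embeddings
# does when shrinking).
resized = nn.Embedding(new_vocab, 8)
resized.weight.data.copy_(emb.weight.data[:new_vocab])

print(orig_vocab % 64, resized.weight.shape[0])  # 0 50296
```

The dropped rows were never reachable from the tokenizer, so shrinking them away doesn't lose any real tokens.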