Special pad token
#6
by dk2502
Hi!
The special pad token is 32000. But the model vocab size is 32000. Doesn't this make the tokenizer vocab larger than the model vocab? By that, I mean the token 32000 can't be mapped to a vector by the embedding layer?
So if you print len(tokenizer.get_vocab()), it will give you 32001,
but model.model.embed_tokens will give you Embedding(32000, 4096, padding_idx=0).
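A minimal sketch of the mismatch, using a plain-Python stand-in for the embedding lookup (the row count and pad id are taken from the post above; the real fix in transformers is `model.resize_token_embeddings(len(tokenizer))`, which grows the embedding table to match the tokenizer):

```python
# Stand-in for the embedding table: Embedding(32000, 4096) has rows
# for token ids 0 .. 31999 only.
embed_rows = 32000
pad_token_id = 32000  # the added special pad token; tokenizer vocab becomes 32001


def lookup(token_id, num_rows):
    """Simulate the embedding layer's index bounds check."""
    if not 0 <= token_id < num_rows:
        raise IndexError(
            f"token id {token_id} is out of range for a table with {num_rows} rows"
        )
    return token_id  # a real embedding would return a 4096-dim vector here


try:
    lookup(pad_token_id, embed_rows)
except IndexError as err:
    print("before resize:", err)

# After model.resize_token_embeddings(len(tokenizer)) the table has
# 32001 rows, so id 32000 becomes a valid index.
embed_rows = 32001
print("after resize: id", lookup(pad_token_id, embed_rows), "is valid")
```

So yes: until the embeddings are resized, feeding the pad token id to the model would index past the end of the table.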