Special pad token

#6
by dk2502 - opened

Hi!

The special pad token has id 32000, but the model vocab size is also 32000. Doesn't this make the tokenizer vocab larger than the model vocab? By that, I mean token id 32000 can't be mapped to a vector by the embedding layer, since valid indices only run from 0 to 31999?

So printing

```python
len(tokenizer.get_vocab())
```

gives you 32001, but `model.model.embed_tokens` gives you

```python
Embedding(32000, 4096, padding_idx=0)
```
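To make the mismatch concrete, here is a minimal pure-Python sketch (no `transformers` dependency, and a tiny embedding dimension instead of the real 4096) of why a token id equal to the embedding row count is unreachable. In practice, the usual fix when extra special tokens are added is `model.resize_token_embeddings(len(tokenizer))`:

```python
# An embedding table with vocab_size rows can only map ids 0 .. vocab_size - 1.
vocab_size = 32000      # rows in model.model.embed_tokens
pad_token_id = 32000    # id assigned to the special pad token
embed_dim = 4           # toy dimension; the real model uses 4096

embedding_table = [[0.0] * embed_dim for _ in range(vocab_size)]

def embed(token_id):
    # Mirrors what an embedding lookup does: index into the table.
    if not 0 <= token_id < len(embedding_table):
        raise IndexError(
            f"token id {token_id} is out of range for embedding "
            f"of size {len(embedding_table)}"
        )
    return embedding_table[token_id]

embed(31999)          # last valid id: works
try:
    embed(pad_token_id)   # id 32000 has no row in a 32000-row table
except IndexError as e:
    print(e)
```

So a 32001-entry tokenizer vocab paired with a 32000-row embedding does leave the pad token without a row, unless the embedding matrix is resized to match.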
