about pad_token_id

by ToughStone - opened Jun 1, 2023

Jun 1, 2023

I got an error when loading the model:
size mismatch for model.decoder.embed_positions.weight: copying a param with shape torch.Size([1026, 768]) from checkpoint, the shape in current model is torch.Size([1025, 768]).
When the position embedding layer is created, its size is set to 1024 + pad_token_id + 1. In the Chinese vocabulary pad_token_id=0, while in the English one it is 1. Where is the problem?
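
To make the arithmetic concrete, here is a minimal sketch of the sizing rule as described in this post. The helper name `position_table_rows` and the constant `MAX_POSITIONS` are hypothetical, used only for illustration; the formula itself is taken from the description above and has not been checked against any particular library version.

```python
# Sizing rule as stated in the post: rows = 1024 + pad_token_id + 1.
# MAX_POSITIONS and position_table_rows are illustrative names, not library APIs.
MAX_POSITIONS = 1024


def position_table_rows(pad_token_id: int, max_positions: int = MAX_POSITIONS) -> int:
    """Return the number of rows in the position embedding table under the stated rule."""
    return max_positions + pad_token_id + 1


# Compare the two pad ids mentioned in the post.
for pad_id in (0, 1):
    print(f"pad_token_id={pad_id} -> {position_table_rows(pad_id)} rows")
```

Under this rule, the 1026 rows in the checkpoint correspond to pad_token_id=1 and the 1025 rows in the current model correspond to pad_token_id=0, so the one-row gap in the error message matches a pad id that differs between the saved weights and the freshly built model.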

Jun 10, 2023

How did you load the model and tokenizer? Both should be loaded from bart-base-chinese.
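
A rough sketch of what loading both from the same checkpoint could look like with the `transformers` library. The choice of `BertTokenizer` and `BartForConditionalGeneration`, and the helper names `load_matching_pair` and `pad_ids_agree`, are assumptions made for illustration; check the model card for the classes it recommends and pass the full repository id of bart-base-chinese as `checkpoint`.

```python
def load_matching_pair(checkpoint: str):
    """Load tokenizer and model from one checkpoint so their settings come from the same place.

    Assumption: the checkpoint ships a BERT-style vocabulary and a BART model.
    """
    # Imported lazily so this helper can be defined without transformers installed.
    from transformers import BartForConditionalGeneration, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(checkpoint)
    model = BartForConditionalGeneration.from_pretrained(checkpoint)
    return tokenizer, model


def pad_ids_agree(tokenizer_pad_id, model_pad_id) -> bool:
    """Return True when the tokenizer and the model config use the same pad id."""
    return tokenizer_pad_id == model_pad_id
```

After loading, comparing `tokenizer.pad_token_id` with `model.config.pad_token_id` via `pad_ids_agree` is a quick sanity check before looking further into the shape mismatch.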

yf changed discussion status to closed Sep 9, 2023
