vocab size wrong in facebook/bart-large-cnn

#101
by silverbeats - opened

I ran into the following error:

/pytorch/aten/src/ATen/native/cuda/IndexKernelUtils.cu:16: vectorized_gather_kernel: block: [726,0,0], thread: [223,0,0] Assertion `ind >=0 && ind < ind_dim_size && "vectorized gather kernel index out of bounds"` failed.

While debugging, I traced the failure to the step that converts input_ids to embeddings. In the end, I found that vocab.json has 50265 tokens, but vocab_size in config.json is 50264.
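For anyone who wants to check this on their own copy, here is a minimal sketch of the comparison. It assumes vocab.json maps token strings to integer ids and that config.json has a vocab_size key; the function name check_vocab_against_config and the helper load_json are hypothetical names used for illustration, not part of any library.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file such as a local copy of vocab.json or config.json."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def check_vocab_against_config(vocab: dict, config: dict) -> dict:
    """Compare the tokenizer's ids with the embedding size from the config.

    An embedding table with vocab_size rows only accepts ids in the range
    0 .. vocab_size - 1, so any token id >= vocab_size is out of bounds.
    """
    vocab_size = config["vocab_size"]
    max_id = max(vocab.values())
    out_of_range = sorted(tok for tok, idx in vocab.items() if idx >= vocab_size)
    return {
        "num_tokens": len(vocab),
        "vocab_size": vocab_size,
        "max_token_id": max_id,
        "tokens_out_of_range": out_of_range,
        "fits": max_id < vocab_size,
    }


if __name__ == "__main__":
    # Toy data standing in for the real files (assumption, for illustration).
    toy_vocab = {"<s>": 0, "<pad>": 1, "</s>": 2, "hello": 3}
    toy_config = {"vocab_size": 3}
    print(check_vocab_against_config(toy_vocab, toy_config))
```

To run it against the real files, pass load_json("vocab.json") and load_json("config.json") from a local download of the model. If "fits" comes back False, the listed tokens are the ones that would trigger the out-of-bounds index in the embedding lookup.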
