Tokenizer cannot recognise unused tokens in its vocab

#13
by ArvinZhuang - opened


Sometimes we want to leverage the unused tokens already reserved in the vocab (e.g. [unused39]) for downstream tasks, but the tokenizer splits them into subword pieces instead of keeping them as single tokens.
This can be worked around with self.tokenizer.add_special_tokens({"additional_special_tokens": ["[unused39]"]}). A sketch is shown below.
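A minimal sketch of the workaround, assuming a BERT-style checkpoint whose vocabulary reserves [unusedN] tokens; the checkpoint name below is a placeholder, not necessarily the model in this repo:

```python
from transformers import AutoTokenizer

# Placeholder checkpoint with reserved [unusedN] entries in its vocab.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Without the workaround the unused token is split into subwords,
# even though it already has its own ID in the vocab.
print(tokenizer.tokenize("[unused39]"))               # split into pieces, e.g. ['[', 'unused', '##39', ']']
print(tokenizer.convert_tokens_to_ids("[unused39]"))  # the reserved single-token ID still exists

# Registering it as an additional special token makes the tokenizer keep it
# intact. Since the token is already in the vocab, the vocab size does not grow.
tokenizer.add_special_tokens({"additional_special_tokens": ["[unused39]"]})
print(tokenizer.tokenize("some text [unused39] more text"))
# -> ['some', 'text', '[unused39]', 'more', 'text']
```

Because the token already exists in the vocabulary, no embedding resize is needed after calling add_special_tokens this way.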

ArvinZhuang changed discussion title from Tokenizer cannot recognise unused tokens. to Tokenizer cannot recognise unused tokens in its vocab
