How to use unused tokens? (UNUSED_0, UNUSED_1, etc.)
#6
by RifqiAnshariR - opened
Hi. I have a question about the unused tokens in indobert-base-p1. I want to fine-tune the model after adding some "new" special tokens. Should I assign my new vocabulary entries to the [UNUSED_X] tokens? And why does an [UNUSED_X] token get split into multiple sub-tokens when I do:
encoded = tokenizer_p1.encode("[UNUSED_0]")
encoded
Is this actually a reserved unused token in the model vocab?
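
For context, here is a minimal sketch of why a string can come back as several sub-tokens. It assumes a toy greedy longest-match WordPiece split, and it is not the model's real tokenizer or vocabulary. The names `wordpiece_split` and `toy_vocab` are hypothetical, made up for illustration. The point is only that a string is returned as a single piece when its exact spelling is a vocabulary entry, and is broken into smaller pieces otherwise.

```python
def wordpiece_split(text, vocab, unk="[UNK]"):
    """Toy greedy longest-match-first split of one string against a vocab.

    Simplified illustration only: real BERT tokenizers also lowercase,
    split on punctuation, and treat registered special tokens separately.
    """
    pieces = []
    start = 0
    while start < len(text):
        end = len(text)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while end > start:
            candidate = text[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # nothing in the vocab covers this position
        pieces.append(match)
        start = end
    return pieces


# Hypothetical vocabulary: only "[UNUSED_0]" exists as a whole entry.
toy_vocab = {"[UNUSED_0]", "[", "##UNUSED", "##_", "##1", "##]"}

whole = wordpiece_split("[UNUSED_0]", toy_vocab)
split = wordpiece_split("[UNUSED_1]", toy_vocab)
print(whole)
print(split)
```

With the real tokenizer, a comparable check would be whether the exact string appears in `tokenizer_p1.get_vocab()`, since a reserved token whose spelling differs from the one typed (case, underscore, brackets) would not match as a single entry.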