Tokenizer splits words that are in the vocab
#5
by one-thing - opened
Thank you for pointing this out. It seems that letting HF auto-identify the tokenizer type causes this issue. Please load the tokenizer with LlamaTokenizer instead of AutoTokenizer.
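For reference, here is a minimal sketch of what that swap might look like. The repo id is a placeholder (this thread does not name it), `load_llama_tokenizer` and `is_split` are hypothetical helpers written for this example, and LlamaTokenizer needs the sentencepiece package installed.

```python
def load_llama_tokenizer(repo_id: str):
    """Load the tokenizer with the explicit LlamaTokenizer class.

    `repo_id` is whatever model repo you are using (placeholder here).
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import LlamaTokenizer  # requires the sentencepiece package

    return LlamaTokenizer.from_pretrained(repo_id)


def is_split(tokenizer, word: str) -> bool:
    """Return True if `word` is encoded as more than one token."""
    return len(tokenizer.tokenize(word)) > 1


# Example usage (assumption: replace the placeholder with the actual repo id):
#   tok = load_llama_tokenizer("your-org/your-model")
#   print(tok.tokenize("some word from the vocab"))
#   print(is_split(tok, "some word from the vocab"))
```

If a word that is in the vocab still comes back as several pieces with LlamaTokenizer, comparing the output of `tokenize` for both classes on the same input should help narrow down where they differ.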
rahular changed discussion status to closed
