
Cannot load model due to Tokenizer issues.

#8
by Fhrozen - opened

Hello there.
Thank you for providing this excellent model.

I am currently facing an issue with transformers==4.55.0 and mistral_common==1.8.3.

It seems that this model is not supported by mistral_common (which I need for other models).
When I tried to load this model, I got the error:

File "/workspaces/venv/lib/python3.12/site-packages/transformers/tokenization_mistral_common.py", line 1767, in from_pretrained
    tokenizer_path = download_tokenizer_from_hf_hub(
File "/workspaces/venv/lib/python3.12/site-packages/mistral_common/tokens/tokenizers/utils.py", line 159, in download_tokenizer_from_hf_hub
ValueError: No tokenizer file found for model ID: ByteDance-Seed/Seed-X-PPO-7B

This is because mistral-common only supports SentencePiece tokenizers ("*.model") or Tekken tokenizers (tekken.json):

https://github.com/mistralai/mistral-common/blob/d6d380a7fdeab2456c22400bfdc81c5210a78313/src/mistral_common/tokens/tokenizers/utils.py#L74-L91
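
To illustrate why the lookup fails, here is a rough sketch of that kind of filename check. It is not the library's actual code: the function name and the example file listings are hypothetical, and the linked source may accept additional (for example, versioned) suffixes that this sketch ignores.

```python
from pathlib import PurePosixPath
from typing import List


def find_mistral_common_tokenizer_files(repo_files: List[str]) -> List[str]:
    """Return the files that look like SentencePiece or Tekken tokenizers.

    Simplified assumption: a file qualifies if its base name is
    "tekken.json" or ends with ".model".
    """
    matches = []
    for name in repo_files:
        base = PurePosixPath(name).name
        if base == "tekken.json" or base.endswith(".model"):
            matches.append(name)
    return matches


# Hypothetical file listings, for illustration only.
hf_style_repo = ["config.json", "tokenizer.json", "tokenizer_config.json"]
mistral_style_repo = ["params.json", "tekken.json"]

print(find_mistral_common_tokenizer_files(hf_style_repo))
print(find_mistral_common_tokenizer_files(mistral_style_repo))
```

A repository that only ships a Hugging Face style tokenizer.json yields no match under this check, which is consistent with the "No tokenizer file found" error above.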

Are you planning to add a tokenizer file (tekken.json) to the model repository, or is there a workaround in the meantime?

Best.

Any updates or a solution?

I ran into the same problem. Is there any solution?

Use the example code in the quick_start section of README.md.
