AttributeError when using Transformers v5
I am using vLLM 0.16.0+cu130.
When I run uv pip install --upgrade transformers, it installs Transformers 5.2.0, but this causes an error:
AttributeError: 'cachedmistralcommonbackend' has no attribute 'is_fast'
However, it works fine with Transformers 4.57.6, although there are some warnings.
It seems that the newest Transformers version compatible with vLLM is still a 4.x release.
vLLM 0.16.0 is officially released, so which version of Transformers should I use?
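When reporting version mismatches like this, it helps to include the exact installed versions. A minimal sketch for collecting them (the `report_versions` helper name and package list are my own, not from vLLM or Transformers):

```python
import importlib.metadata

def report_versions(packages=("vllm", "transformers")):
    """Return the installed version for each package, or 'missing' if absent."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = importlib.metadata.version(pkg)
        except importlib.metadata.PackageNotFoundError:
            versions[pkg] = "missing"
    return versions

print(report_versions())
```

Pasting this output alongside the traceback makes it easier to tell whether the failure comes from a Transformers 5.x upgrade or from the vLLM build itself.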
Hey thanks for reaching out, can you give the code snippet you use to spin up the model as well as the error trace ?
I’m using the latest vLLM (0.16.0+cu130) and Transformers (5.2.0), and I’m launching the model with the same command provided in the model card.
detail: https://gist.github.com/ariable/a58d87605a8121b0151054e85493f089
Thank you very much.
I just tested using the latest vllm docker image and installing latest transformers:
vllm 0.16.0
transformers 5.2.0
VLLM_DISABLE_COMPILE_CACHE=1 vllm serve mistralai/Voxtral-Mini-4B-Realtime-2602 --compilation_config '{"cudagraph_mode": "PIECEWISE"}'
I don't have any issues serving the model or running the referenced examples from the model card. Are you sure you followed the provided snippets exactly?
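For completeness, the steps described above as a single sequence; note the `vllm/vllm-openai:latest` image tag is an assumption, so substitute whichever image you actually pulled:

```shell
# Assumed image tag; use the image you normally run.
docker run --gpus all --rm -it -p 8000:8000 vllm/vllm-openai:latest bash

# Inside the container: upgrade to the latest Transformers (5.2.0 at the
# time of this thread), then serve with the flags from the model card.
pip install --upgrade transformers
VLLM_DISABLE_COMPILE_CACHE=1 vllm serve mistralai/Voxtral-Mini-4B-Realtime-2602 \
  --compilation_config '{"cudagraph_mode": "PIECEWISE"}'
```

If this exact sequence fails for you, the differing step is likely where the problem is.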
It now works with vLLM 0.16.1rc and Transformers 5.3.0.dev on CUDA 12.8. I will test it with CUDA 13 when I have time.