Fix tokenization_kimi.py bytes_to_unicode import for transformers v5+

#2
by Jin-2025 - opened

tokenization_kimi.py (loaded via trust_remote_code=True) imports bytes_to_unicode from transformers.models.gpt2.tokenization_gpt2. In transformers >= 5.0 that helper was relocated to transformers.convert_slow_tokenizer and the GPT-2 module no longer re-exports it, so loading the Kimi tokenizer crashes during vLLM startup with:

ImportError: cannot import name 'bytes_to_unicode' from
'transformers.models.gpt2.tokenization_gpt2'

This blocks serving amd/Kimi-K2.5-W4A8 on any image that ships transformers v5+ (e.g. vllm/vllm-openai-rocm:v0.19.1 and later releases).
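For reference, the failure reproduces outside vLLM with a bare import (a minimal sketch; the import path matches the traceback above):

# Minimal repro: fails on transformers >= 5.0, works on v4.
import transformers
print(transformers.__version__)

from transformers.models.gpt2.tokenization_gpt2 import bytes_to_unicode  # raises ImportError on v5+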

Fix

One-line change to import bytes_to_unicode from transformers.convert_slow_tokenizer instead:

-from transformers.models.gpt2.tokenization_gpt2 import bytes_to_unicode
+from transformers.convert_slow_tokenizer import bytes_to_unicode

The function itself is byte-for-byte unchanged; only its public import path moved. Since the new path also exists in transformers v4, the fix is both forward- and backward-compatible.
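For deployments that pin a range of transformers versions and want extra safety, a try/except variant also works (a sketch; both paths are taken from this discussion, and the fallback branch is a defensive addition, not part of the merged diff):

# Prefer the stable location, present in both transformers v4 and v5+.
try:
    from transformers.convert_slow_tokenizer import bytes_to_unicode
except ImportError:
    # Defensive fallback to the old GPT-2 path for older environments.
    from transformers.models.gpt2.tokenization_gpt2 import bytes_to_unicode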

Verification

Running:

vllm serve /models/Kimi-K2.5-W4A8 --tensor-parallel-size 8 ...

on vllm/vllm-openai-rocm:v0.19.1 confirms that the tokenizer now loads successfully and startup proceeds to model loading.
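The tokenizer can also be sanity-checked in isolation, without a full serve run (a sketch; the model path comes from the serve command above and the prompt string is arbitrary):

from transformers import AutoTokenizer

# trust_remote_code=True picks up the patched tokenization_kimi.py.
tok = AutoTokenizer.from_pretrained("/models/Kimi-K2.5-W4A8", trust_remote_code=True)
print(tok.encode("hello"))  # succeeds once the import path is fixed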

bowenbaoamd changed pull request status to merged
