Fix _init_rope compatibility with transformers >= 5.x rope_scaling standardization
#10
by DennisHuang648 - opened
Problem
Starting with transformers 4.43, `AutoConfig` automatically standardizes a missing or `None` `rope_scaling` to `{"rope_type": "default", "factor": 1.0}` during config initialization. MiniCPM-SALA's `_init_rope()` only handles `None`, `"linear"`, `"dynamic"`, and `"longrope"`; encountering `"default"` raises:

```
ValueError: Unknown RoPE scaling type default
```
This is triggered when loading MiniCPM-SALA via any library that depends on transformers >= 5.x (e.g. GPTQModel 7.0.0 pulls in transformers 5.8.0), even though the model does not use RoPE scaling (its `config.json` has no `rope_scaling` field).
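The failure mode can be sketched in isolation. This is a hypothetical stand-in for the old `_init_rope()` logic, not the actual modeling code; it shows why a config standardized to `"default"` falls through every recognized branch and hits the `ValueError`:

```python
# Hypothetical sketch of the pre-fix dispatch logic (assumed names,
# not the real modeling_minicpm_sala.py code).
def old_init_rope(rope_scaling):
    if rope_scaling is None:
        return "MiniCPMRotaryEmbedding"  # no scaling configured
    # Read whichever key is present; newer configs carry "rope_type".
    scaling_type = rope_scaling.get("type") or rope_scaling.get("rope_type")
    if scaling_type in ("linear", "dynamic", "longrope"):
        return scaling_type
    # The auto-filled "default" lands here and raises.
    raise ValueError(f"Unknown RoPE scaling type {scaling_type}")
```

With transformers 4.x the config arrives as `None` and the first branch handles it; with 5.x it arrives as `{"rope_type": "default", ...}` and the exception fires.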
Changes
Modified both `_init_rope()` methods in `modeling_minicpm_sala.py` (`MiniCPMInfLLMv2Attention` at line 879 and `MiniCPMSALAMLA` at line 2144):
- Treat `rope_scaling` of `None` or `scaling_type in (None, "default")` as no scaling, falling through to the standard `MiniCPMRotaryEmbedding`.
- Support both the legacy `"type"` key and the new `"rope_type"` key for cross-version compatibility.
- Extract `rope_scaling` into a local variable to avoid repeated `self.config.rope_scaling` access.
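The three changes above can be sketched as one helper. This is a simplified, hypothetical rendering of the described logic (names like `resolve_rope_scaling` are illustrative and do not appear in the PR):

```python
# Sketch of the fixed dispatch, under the assumptions stated above.
def resolve_rope_scaling(config_rope_scaling):
    """Return (scaling_type, factor), or None when no scaling applies."""
    rope_scaling = config_rope_scaling  # read once into a local variable
    if rope_scaling is None:
        return None  # transformers 4.x: field absent -> no scaling
    # Prefer the new "rope_type" key, fall back to the legacy "type" key.
    scaling_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    if scaling_type in (None, "default"):
        return None  # auto-filled "default" is treated as no scaling
    if scaling_type not in ("linear", "dynamic", "longrope"):
        raise ValueError(f"Unknown RoPE scaling type {scaling_type}")
    return scaling_type, rope_scaling.get("factor", 1.0)
```

A `None` return falls through to the standard `MiniCPMRotaryEmbedding`, so both the 4.x and 5.x config shapes take the same no-scaling path.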
Impact
- No behavioral change: the model's `config.json` has no `rope_scaling` field; this fix only ensures the auto-filled `"default"` is correctly treated as no scaling.
- Backward compatible: works with both transformers 4.x (`rope_scaling=None`) and 5.x (`rope_scaling={"rope_type": "default", ...}`).
- Verified: tested with GPTQModel 7.0.0 + transformers 5.8.0 in the SOAR evaluation Docker image. Without this fix, `GPTQModel.load()` raises `ValueError`; with this fix, the model loads successfully.
suhmily changed pull request status to merged