Fix _init_rope compatibility with transformers >= 5.x rope_scaling standardization

#10

Problem

Starting with transformers 4.43, AutoConfig automatically standardizes a missing or None rope_scaling to {"rope_type": "default", "factor": 1.0} during config initialization. MiniCPM-SALA's _init_rope() only handles None, "linear", "dynamic", and "longrope"; encountering "default" causes:

ValueError: Unknown RoPE scaling type default

This is triggered when loading MiniCPM-SALA via any library that depends on transformers >= 5.x (e.g. GPTQModel 7.0.0 pulls in transformers 5.8.0), even though the model does not use RoPE scaling (config.json has no rope_scaling field).
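The failure mode can be sketched as follows. This is a simplified stand-in for the pre-fix dispatch in _init_rope(), not the real method body: the function name, return strings, and structure are illustrative, and the actual code constructs rotary-embedding modules instead of returning labels.

```python
def init_rope_pre_fix(rope_scaling):
    # Pre-fix behavior (sketch): only None and three explicit scaling
    # types are handled; anything else raises.
    if rope_scaling is None:
        return "default_rope"  # standard MiniCPMRotaryEmbedding
    scaling_type = rope_scaling["type"]  # legacy key only
    if scaling_type == "linear":
        return "linear_rope"
    if scaling_type == "dynamic":
        return "dynamic_rope"
    if scaling_type == "longrope":
        return "longrope_rope"
    raise ValueError(f"Unknown RoPE scaling type {scaling_type}")

# Newer transformers auto-fills this for configs without rope_scaling
# (it keeps a legacy "type" alias alongside "rope_type"), so the
# dispatch above sees "default" and raises:
standardized = {"rope_type": "default", "type": "default", "factor": 1.0}
```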

Changes

Modified both _init_rope() methods in modeling_minicpm_sala.py (MiniCPMInfLLMv2Attention at line 879 and MiniCPMSALAMLA at line 2144):

  • Treat a rope_scaling of None, or a scaling_type of None or "default", as no scaling, falling through to the standard MiniCPMRotaryEmbedding.
  • Support both legacy "type" key and new "rope_type" key for cross-version compatibility.
  • Extract rope_scaling into a local variable to avoid repeated self.config.rope_scaling access.
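The three changes above can be sketched together. As with the snippet earlier, this is an illustrative stand-in, not the real _init_rope() body: names and return labels are hypothetical, and the real method builds rotary-embedding modules.

```python
def init_rope_fixed(rope_scaling):
    # rope_scaling is read once by the caller (a local variable, not
    # repeated self.config.rope_scaling access).
    if rope_scaling is None:
        return "default_rope"
    # Support both the new "rope_type" key and the legacy "type" key.
    scaling_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    if scaling_type in (None, "default"):
        # Auto-filled "default" means no scaling: fall through to the
        # standard MiniCPMRotaryEmbedding.
        return "default_rope"
    if scaling_type == "linear":
        return "linear_rope"
    if scaling_type == "dynamic":
        return "dynamic_rope"
    if scaling_type == "longrope":
        return "longrope_rope"
    raise ValueError(f"Unknown RoPE scaling type {scaling_type}")
```

With this dispatch, both the 4.x shape (None) and the 5.x shape ({"rope_type": "default", ...}) resolve to the unscaled embedding, while explicit scaling types behave as before.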

Impact

  • No behavioral change: The model's config.json has no rope_scaling field; this fix only ensures the auto-filled "default" is correctly treated as no scaling.
  • Backward compatible: Works with both transformers 4.x (rope_scaling=None) and 5.x (rope_scaling={"rope_type": "default", ...}).
  • Verified: Tested with GPTQModel 7.0.0 + transformers 5.8.0 in the SOAR evaluation Docker image. Without this fix, GPTQModel.load() raises ValueError; with this fix, model loads successfully.
suhmily changed pull request status to merged