Update MiMo-V2-Flash config.json for native Transformers compatibility

#41
by casinca - opened

Hello Xiaomi MiMo team,

I am currently working on a PR to add the MiMo-V2-Flash model to the Transformers library, and I've been asked whether it would be possible to add a few entries to your config.json, so that a single unified config.json can satisfy both the native Transformers hyperparameter conventions and your current remote implementation.

This PR therefore updates your config.json to include the native Transformers config entries while keeping full backward compatibility with your current remote implementation (one merged config.json, no change in model behavior).

Added entries

  • rms_norm_eps (native alias of layernorm_epsilon)
  • layer_types (native alias of hybrid_layer_pattern)
  • mlp_layer_types (native alias of moe_layer_freq)
  • rope_parameters (native RoPE structure for full/sliding attention)
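To illustrate the idea, a merged config.json would carry the legacy key and its native alias side by side, so both loading paths read the value they expect. The snippet below is only a sketch: the key names follow Transformers conventions, but the values and the exact shape of the per-layer lists are illustrative, not the actual MiMo-V2-Flash hyperparameters (other legacy fields are omitted for brevity):

```json
{
  "layernorm_epsilon": 1e-06,
  "rms_norm_eps": 1e-06,
  "layer_types": ["sliding_attention", "sliding_attention", "full_attention"],
  "rope_parameters": {
    "full_attention": { "rope_type": "default", "rope_theta": 10000.0 },
    "sliding_attention": { "rope_type": "default", "rope_theta": 10000.0 }
  },
  "routed_scaling_factor": 1.0
}
```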

Changed entries (non-breaking changes)

  • routed_scaling_factor: null -> 1.0 (a scaling factor of 1.0 leaves routed expert outputs unchanged, matching the previous null behavior)


All legacy fields used by modeling_mimo_v2_flash.py are preserved, so this remains compatible with existing loading paths.


Tagging @AntonV from the Hugging Face team who also worked on it and is aware of this alignment request.

Thanks! Kind regards.

Thanks a lot!! This would be very nice to merge 🤗
