Update MiMo-V2-Flash config.json for native Transformers compatibility
Hello Xiaomi MiMo team,
I am currently working on a PR to add the MiMo-V2-Flash model to the Transformers library, and I was asked whether a few entries could be added to your config.json file, so that a single unified config.json aligns the native Transformers hyperparameter conventions with your current configuration.
This PR therefore updates your config.json to include the native Transformers config entries while keeping full backward compatibility with your current remote implementation (a single merged config.json, no change in model behavior).
Added entries
- `rms_norm_eps` (native alias of `layernorm_epsilon`)
- `layer_types` (native alias of `hybrid_layer_pattern`)
- `mlp_layer_types` (native alias of `moe_layer_freq`)
- `rope_parameters` (native RoPE structure for full/sliding attention)
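For illustration, a merged config.json fragment could look like the sketch below. The key names are the ones listed in this PR; the concrete values and list lengths are made-up placeholders, not the model's actual hyperparameters:

```json
{
  "layernorm_epsilon": 1e-06,
  "rms_norm_eps": 1e-06,

  "hybrid_layer_pattern": [1, 0, 1, 0],
  "layer_types": ["full_attention", "sliding_attention", "full_attention", "sliding_attention"],

  "moe_layer_freq": [0, 1, 0, 1],
  "mlp_layer_types": ["dense", "sparse", "dense", "sparse"],

  "rope_parameters": {
    "full_attention": {"rope_type": "default", "rope_theta": 10000.0},
    "sliding_attention": {"rope_type": "default", "rope_theta": 10000.0}
  }
}
```

Each native key sits next to its legacy counterpart, so both naming conventions resolve from the same file.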
Changed entries (non-breaking changes)
- `routed_scaling_factor`: `null` -> `1.0`
All legacy fields used by `modeling_mimo_v2_flash.py` are preserved, so this remains compatible with existing loading paths.
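As a small sanity sketch of why keeping both key families is non-breaking (plain Python, with a hypothetical value; only the key names come from this PR): a loading path that reads the legacy name and one that prefers the native alias both resolve to the same setting.

```python
# Illustrative sketch only: the value 1e-6 is a placeholder, not the real hyperparameter.
merged_config = {
    # legacy key read by the existing remote implementation
    "layernorm_epsilon": 1e-6,
    # native Transformers alias added by this PR, same value
    "rms_norm_eps": 1e-6,
}

# The legacy loading path keeps working unchanged:
legacy_eps = merged_config["layernorm_epsilon"]

# A native path can prefer the new key and fall back to the legacy one:
native_eps = merged_config.get("rms_norm_eps", merged_config["layernorm_epsilon"])

assert legacy_eps == native_eps
```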
Tagging @AntonV from the Hugging Face team, who has also worked on this and is aware of the alignment request.
Thanks! Kind regards.
Thanks a lot!! This would be very nice to merge 🤗