cosmos3-dk1-cartesian / lora_config.json
andreaskoepf
cosmos3-dk1-cartesian delta @ iter 10000 (gen-attn + action-IO full-FT + gen-MLP LoRA) + merge script + model card
685301c verified
{
  "lora_rank": 16,
  "lora_alpha": 32,
  "scaling": 2.0,
  "target_modules": ["mlp_moe_gen.gate_proj", "mlp_moe_gen.up_proj", "mlp_moe_gen.down_proj"],
  "note": "Path-qualified targets: LoRA is on the GENERATION expert MLP only, not the frozen reasoner's mlp.*. Apply as W += scaling*(B@A), or inject LoRA with these r/alpha/targets and load the lora_* keys."
}
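
The merge rule in the note, W += scaling*(B@A), can be sketched as follows. This is a minimal illustration and not the repository's merge script. It assumes the usual LoRA shape convention (W is out x in, A is rank x in, B is out x rank) and that `scaling` equals `lora_alpha / lora_rank`, which matches the values above (32 / 16 = 2.0). The parameter names passed to `is_target` and the helper names `is_target` and `merge_lora` are hypothetical; the actual checkpoint key layout is not shown in this file.

```python
import json

import numpy as np

# Values copied from lora_config.json above (the "note" field is omitted).
config = json.loads("""
{
  "lora_rank": 16,
  "lora_alpha": 32,
  "scaling": 2.0,
  "target_modules": ["mlp_moe_gen.gate_proj", "mlp_moe_gen.up_proj", "mlp_moe_gen.down_proj"]
}
""")


def is_target(param_name, targets):
    """Return True if a dotted parameter name contains a path-qualified target.

    Matching on whole dotted components keeps `mlp.gate_proj` (the frozen
    reasoner) from matching `mlp_moe_gen.gate_proj` (the generation expert).
    """
    padded = f".{param_name}."
    return any(f".{target}." in padded for target in targets)


def merge_lora(weight, lora_a, lora_b, scaling):
    """Return weight + scaling * (B @ A).

    Assumed shapes: weight (out, in), lora_a (rank, in), lora_b (out, rank).
    """
    return weight + scaling * (lora_b @ lora_a)


# Toy tensors standing in for one targeted projection (sizes are made up).
rng = np.random.default_rng(0)
rank = config["lora_rank"]
out_features, in_features = 8, 6
W = rng.standard_normal((out_features, in_features))
A = rng.standard_normal((rank, in_features))
B = rng.standard_normal((out_features, rank))

merged = merge_lora(W, A, B, config["scaling"])

# Hypothetical key names, used only to show the path-qualified matching.
gen_key = "layers.0.mlp_moe_gen.gate_proj.weight"
reasoner_key = "layers.0.mlp.gate_proj.weight"
print(is_target(gen_key, config["target_modules"]))
print(is_target(reasoner_key, config["target_modules"]))
```

Merging this way folds the adapter into the base weight, so no LoRA modules are needed at inference time. The alternative named in the note, injecting LoRA modules with the same rank, alpha, and targets and then loading the `lora_*` keys, keeps the adapter separate from the base weights.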