# TraceLift Code Reason RM Full Checkpoint

This is the released code-domain TraceLift Reason RM.

- Base initialization: `Qwen/Qwen2.5-7B-Instruct`
- Training: LoRA, merged into full weights
- LoRA rank: `32`
- LoRA alpha: `64`
- LoRA dropout: `0.05`
- Rubric heads: five 5-way classification heads
- Total head: one scalar regression head
- Dimension loss: cross entropy
- Total loss: Huber loss on the normalized total score

The directory can be loaded directly with `reasonrm.modeling_reward.Qwen2ForReasonRewardModel.from_pretrained(...)`.
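
The head and loss configuration listed above can be illustrated with a small, dependency-free sketch. It is not the released training code: the function names, the averaging over the five rubric heads, the `total_weight` factor, and the Huber `delta` are all assumptions made for illustration, and the normalized total score is simply taken as a given float.

```python
import math

NUM_DIMENSIONS = 5  # five rubric heads (from the model card)
NUM_CLASSES = 5     # each rubric head is a 5-way classifier (from the model card)


def cross_entropy(logits, target):
    """Cross entropy for one rubric head: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


def huber(pred, target, delta=1.0):
    """Huber loss: quadratic for |error| <= delta, linear beyond it."""
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err * err
    return delta * (err - 0.5 * delta)


def reason_rm_loss(dim_logits, dim_labels, total_pred, total_target,
                   total_weight=1.0, delta=1.0):
    """Combine the rubric-head and total-head losses.

    Assumptions (not stated in the model card): the five cross-entropy
    terms are averaged, and the Huber term is added with `total_weight`.
    """
    assert len(dim_logits) == len(dim_labels) == NUM_DIMENSIONS
    assert all(len(row) == NUM_CLASSES for row in dim_logits)
    ce = sum(cross_entropy(row, y) for row, y in zip(dim_logits, dim_labels))
    ce /= NUM_DIMENSIONS
    return ce + total_weight * huber(total_pred, total_target, delta)
```

With uniform logits on every rubric head and a perfect total prediction, the combined loss reduces to `log(5)`, the cross entropy of a uniform 5-way guess.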