Released TraceLift Reason RMs
This directory contains two ready-to-load full Reward Model checkpoints:
- code-rm-full-ce: code-domain Reason RM.
- math-rm-full-ce: math-domain Reason RM.
Both checkpoints were initialized from Qwen/Qwen2.5-7B-Instruct, trained with LoRA, and then merged into full Qwen2ForReasonRewardModel weights.
Training details:
- LoRA rank 32, alpha 64, dropout 0.05.
- Five rubric classification heads with CE dimension loss.
- One total-score head with Huber loss on the normalized total score.
- The released checkpoints already include the backbone, rubric heads, and total head.
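As a rough illustration of how the two loss terms listed above might combine, here is a minimal plain-Python sketch. The equal weighting of the terms, the averaging over rubric heads, the Huber delta of 1.0, and the function names are all assumptions made for illustration; this card does not state them, and the actual training code may differ.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy for one example: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]

def huber(pred, target, delta=1.0):
    """Huber loss: quadratic for small errors, linear for large ones."""
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err * err
    return delta * (err - 0.5 * delta)

def reason_rm_loss(rubric_logits, rubric_labels, total_pred, total_target,
                   total_weight=1.0, delta=1.0):
    """Mean CE over the rubric heads plus a weighted Huber term.

    total_weight and delta are hypothetical hyperparameters.
    """
    ce = sum(cross_entropy(l, y) for l, y in zip(rubric_logits, rubric_labels))
    ce /= len(rubric_logits)
    return ce + total_weight * huber(total_pred, total_target, delta)
```

Here `rubric_logits` holds one list of class logits per rubric head (five in the released models), and `total_pred` and `total_target` are the predicted and reference normalized total scores.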
Load them directly with reasonrm.modeling_reward.Qwen2ForReasonRewardModel.from_pretrained(...).
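A minimal loading sketch, assuming the reasonrm package is installed and the two checkpoint subdirectories sit under a local root directory. The helper name `load_reason_rm` and the `root` parameter are hypothetical; only the class path and `from_pretrained` come from this card.

```python
import os

# Subdirectory names as listed in this card.
CHECKPOINTS = {"code": "code-rm-full-ce", "math": "math-rm-full-ce"}

def load_reason_rm(domain, root="."):
    """Load one released checkpoint; domain is "code" or "math"."""
    # Imported lazily so this helper can be defined without reasonrm installed.
    from reasonrm.modeling_reward import Qwen2ForReasonRewardModel

    path = os.path.join(root, CHECKPOINTS[domain])
    return Qwen2ForReasonRewardModel.from_pretrained(path)
```

For example, `load_reason_rm("math")` would load the math-domain model from ./math-rm-full-ce. The forward-pass inputs and outputs are not described in this card, so consult the reasonrm source before scoring examples.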