ClinicalRewardModel-Qwen2_5-7B

This is a five-head reward model built on the Qwen2.5-7B-Instruct backbone and fine-tuned on clinical decision-making data. It evaluates clinical vignette-based multiple-choice questions against five expert-defined clinical criteria:

Clinical Plausibility
Clinical Utility
Quality of Decision Path
Alignment to Decision Path
Correctness of the Suggested Answer

Each head independently scores one criterion on a 1–5 Likert scale, so the model supports fine-grained, per-criterion quality filtering of guideline-based QA datasets such as MedGUIDE-MCQA-8K.
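As an illustration of per-criterion filtering, here is a minimal sketch in plain Python. It assumes the five head scores for each sample have already been obtained and stored in a dictionary; it does not load or call the model. The criterion keys, the `filter_samples` helper, and the threshold of 4 are hypothetical choices made for this example, not part of the released model or dataset.

```python
from typing import Dict, List

# Hypothetical key names for the five criteria listed above.
CRITERIA = (
    "clinical_plausibility",
    "clinical_utility",
    "decision_path_quality",
    "decision_path_alignment",
    "answer_correctness",
)


def validate_scores(scores: Dict[str, float]) -> None:
    """Check that all five criteria are present and within the 1-5 range."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    for c in CRITERIA:
        if not 1 <= scores[c] <= 5:
            raise ValueError(f"{c} score {scores[c]} is outside the 1-5 scale")


def passes_filter(scores: Dict[str, float], min_score: float = 4.0) -> bool:
    """Return True only if every criterion meets the threshold."""
    validate_scores(scores)
    return all(scores[c] >= min_score for c in CRITERIA)


def filter_samples(samples: List[dict], min_score: float = 4.0) -> List[dict]:
    """Keep samples whose 'scores' dict passes on all five criteria."""
    return [s for s in samples if passes_filter(s["scores"], min_score)]


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    samples = [
        {"id": "q1", "scores": dict(zip(CRITERIA, (5, 4, 4, 5, 5)))},
        {"id": "q2", "scores": dict(zip(CRITERIA, (5, 5, 2, 4, 5)))},
    ]
    kept = filter_samples(samples)
    print([s["id"] for s in kept])  # prints ['q1']
```

Requiring every criterion to clear the threshold is one possible policy; because the heads score independently, a filter could equally weight criteria differently or apply a stricter cutoff only to answer correctness.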


Format: Safetensors
Model size: 7B params
Tensor type: BF16
