SPARE-PRM - a UKPLab Collection

updated Dec 18, 2025

Process Reward Models (PRMs) trained using the Single-Pass Annotation with Reference-Guided Evaluation (SPARE) methodology proposed in our AAAI-2026 paper.

UKPLab/Qwen2.5-3b-spare-prm-math

Text Generation • 3B • Updated Mar 31 • 9

UKPLab/Llama-3-8b-spare-prm-math

Text Generation • 8B • Updated Mar 31 • 6
