license: other
license_name: qwen
license_link: https://huggingface.co/Qwen/Qwen2.5-Math-7B-PRM800K/blob/main/LICENSE
language:
  - en
  - zh
pipeline_tag: text-classification
library_name: transformers
tags:
  - reward model
base_model:
  - Qwen/Qwen2.5-Math-7B-PRM800K

PRM-Math-7B-Reasoner - Process Reward Model

PRMs: identifying and mitigating intermediate errors in the reasoning process

PRM-Math-7B-Reasoner is a fully reproducible process reward model, fine-tuned from the Qwen2.5-Math-7B-PRM800K base model and designed to identify erroneous steps in mathematical reasoning. To compute rewards, a special token "" is inserted after each reasoning step. The reward for a step is the probability that this token is classified as positive, which gives a value between 0 and 1. The model is primarily used for solution reformatting in mathematically driven tasks and as a Long Context Full Reasoner.
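The reward extraction described above can be sketched in plain Python. This is a minimal illustration, not the model's actual inference code: it assumes the model emits a pair of (negative, positive) logits at every token position, and the separator id `SEP`, the function name `step_rewards`, and the toy logits are all hypothetical.

```python
import math


def step_rewards(token_ids, logits, step_token_id):
    """Return one reward in [0, 1] for each step-separator token.

    token_ids: token ids of a solution with a separator after each step.
    logits: one (negative_logit, positive_logit) pair per token position
            (assumed layout, not confirmed by the model card).
    step_token_id: id of the separator token (hypothetical value here).
    """
    rewards = []
    for tok, (neg, pos) in zip(token_ids, logits):
        if tok != step_token_id:
            continue  # only separator positions carry a step reward
        # Two-class softmax, shifted by the max logit for numerical stability.
        m = max(neg, pos)
        e_neg, e_pos = math.exp(neg - m), math.exp(pos - m)
        rewards.append(e_pos / (e_neg + e_pos))
    return rewards


# Toy input: two steps, each closed by the hypothetical separator id 99.
SEP = 99
token_ids = [5, 6, SEP, 7, 8, SEP]
logits = [(0.0, 0.0), (0.0, 0.0), (-2.0, 2.0),
          (0.0, 0.0), (0.0, 0.0), (1.0, -1.0)]
rewards = step_rewards(token_ids, logits, SEP)
print(rewards)
```

In this toy input the first step scores close to 1 and the second well below 0.5, so a caller would flag the second step as the likely error.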

ProcessBench: Paper

ProcessBench: Identifying Process Errors in Mathematical Reasoning (arXiv): https://arxiv.org/pdf/2412.06559

Reformatting Reasoning Intermediate

| Section | Content |
|---|---|
| Title | Example of Solution Reformatting |
| Description | This example demonstrates the reformatting of a solution for finding the foci of an ellipse. The original solution is generated by Qwen2-7B-Instruct, and the reformatted version is presented for clarity. |
| Problem Statement | The ellipse \(\frac{(x-6)^{2}}{25}+\frac{(y-3)^{2}}{9}=1\) has two foci. Find the one with the larger x-coordinate. Enter your answer as an ordered pair, like \((2,1)\). |
| Original Solution | The original solution is provided in a less readable format, with some syntax errors and unclear notation. |
| Reformatted Solution | The reformatted solution clearly explains the steps:<br>1. Identify the center of the ellipse \((6,3)\).<br>2. Calculate the distance from the center to each focus using \(\sqrt{a^2 - b^2}\).<br>3. Determine the foci locations at \((2,3)\) and \((10,3)\).<br>4. Identify the focus with the larger x-coordinate as \((10,3)\). |
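
The four steps of the reformatted solution can be checked numerically. The sketch below is only an illustration of the arithmetic, with a hypothetical helper name `ellipse_foci`. It reads a² = 25 and b² = 9 from the problem statement, so the focal distance is √(25 − 9) = 4.

```python
import math


def ellipse_foci(h, k, x_denominator, y_denominator):
    """Foci of (x-h)^2/x_denominator + (y-k)^2/y_denominator = 1."""
    a2 = max(x_denominator, y_denominator)  # squared semi-major axis
    b2 = min(x_denominator, y_denominator)  # squared semi-minor axis
    c = math.sqrt(a2 - b2)                  # distance from center to each focus
    if x_denominator >= y_denominator:
        # Major axis is horizontal, so the foci share the center's y-coordinate.
        return [(h - c, k), (h + c, k)]
    # Major axis is vertical, so the foci share the center's x-coordinate.
    return [(h, k - c), (h, k + c)]


foci = ellipse_foci(6, 3, 25, 9)
answer = max(foci, key=lambda p: p[0])  # focus with the larger x-coordinate
print(answer)  # (10.0, 3)
```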