Commit 12ea0a7 by prithivMLmods · verified · 1 Parent(s): 7a3bceb

Update README.md

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -18,3 +18,12 @@ base_model:
 
 PRM-Math-7B-Reasoner is a fully reproducible model, fine-tuned on the Qwen2.5-Math-7B-PRM800K dataset, that identifies erroneous steps in mathematical reasoning. It is used for reward computation: a special token "<extra_0>" is inserted after each step, and the probability that this token is classified as positive is extracted as that step's reward, a value between 0 and 1. The model is primarily used for solution reformatting in mathematically driven tasks and as a Long Context Full Reasoner.
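
The reward extraction described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the token list, and the two-class `[negative, positive]` logit layout are hypothetical and are not taken from the model's actual inference code.

```python
import math

STEP_SEP = "<extra_0>"  # separator token named in the README


def mark_steps(steps):
    """Join reasoning steps, appending the separator token after each one."""
    return "".join(step + STEP_SEP for step in steps)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step_rewards(tokens, logits, positive_index=1):
    """Return the positive-class probability at every separator position.

    tokens: token strings of the marked-up solution.
    logits: one [negative, positive] score pair per token
            (assumed layout, for illustration only).
    """
    return [
        softmax(row)[positive_index]
        for tok, row in zip(tokens, logits)
        if tok == STEP_SEP
    ]


# Toy input: two steps, each followed by the separator token.
tokens = ["Step 1", STEP_SEP, "Step 2", STEP_SEP]
logits = [[0.0, 0.0], [-2.0, 2.0], [0.0, 0.0], [1.0, -1.0]]
rewards = step_rewards(tokens, logits)  # one value in (0, 1) per step
```

In this toy input the first step gets a high reward and the second a low one, which is how an erroneous step would show up under these assumptions.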
 
+ # Reformatting Reasoning Intermediate
+
+ | **Section** | **Content** |
+ |--------------------------|-------------|
+ | **Title** | Example of Solution Reformatting |
+ | **Description** | This example demonstrates the reformatting of a solution for finding the foci of an ellipse. The original solution was generated by Qwen2-7B-Instruct; the reformatted version is presented for clarity. |
+ | **Problem Statement** | The ellipse \(\frac{(x-6)^{2}}{25}+\frac{(y-3)^{2}}{9}=1\) has two foci. Find the one with the larger x-coordinate. Enter your answer as an ordered pair, like \((2,1)\). |
+ | **Original Solution** | The original solution is provided in a less readable format, with some syntax errors and unclear notation. |
+ | **Reformatted Solution** | The reformatted solution clearly explains the steps: <br> 1. Identify the center of the ellipse, \((6,3)\). <br> 2. Calculate the distance from the center to each focus: \(c = \sqrt{a^2 - b^2} = \sqrt{25 - 9} = 4\). <br> 3. Determine the foci locations, \((2,3)\) and \((10,3)\). <br> 4. Identify the focus with the larger x-coordinate, \((10,3)\). |
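
The four steps of the reformatted solution can be checked numerically. The sketch below is an illustrative helper (the name `ellipse_foci` and its parameters are hypothetical, not part of the model or its README) for an axis-aligned ellipse \(\frac{(x-h)^2}{a^2}+\frac{(y-k)^2}{b^2}=1\).

```python
import math


def ellipse_foci(h, k, a2, b2):
    """Foci of (x-h)^2/a2 + (y-k)^2/b2 = 1 for an axis-aligned ellipse.

    a2 and b2 are the squared semi-axis lengths under x and y.
    """
    c = math.sqrt(abs(a2 - b2))  # distance from the center to each focus
    if a2 >= b2:
        # Major axis is horizontal: foci lie left and right of the center.
        return [(h - c, k), (h + c, k)]
    # Major axis is vertical: foci lie below and above the center.
    return [(h, k - c), (h, k + c)]


foci = ellipse_foci(6, 3, 25, 9)
answer = max(foci, key=lambda p: p[0])  # focus with the larger x-coordinate
print(foci)    # [(2.0, 3), (10.0, 3)]
print(answer)  # (10.0, 3)
```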