Update README.md
README.md
CHANGED
@@ -18,3 +18,12 @@ base_model:
PRM-Math-7B-Reasoner is a fully reproducible process reward model, fine-tuned from Qwen2.5-Math-7B-PRM800K and designed to identify erroneous steps in mathematical reasoning. To compute rewards, the special token "<extra_0>" is inserted after each reasoning step. The probability that this token is classified as positive is then extracted, which gives each step a reward between 0 and 1. The model is primarily used for solution reformatting in mathematical tasks and as a Long Context Full Reasoner.
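The reward extraction described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the model's actual inference code: the helper names (`join_steps`, `step_rewards`), the two-class logit layout, the choice of index 1 as the "positive" class, and the toy token ids are all hypothetical. Only the `<extra_0>` separator comes from the description.

```python
import math

STEP_SEP = "<extra_0>"  # separator token named in the model description


def join_steps(steps):
    """Append the separator token after every reasoning step."""
    return "".join(step + STEP_SEP for step in steps)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step_rewards(token_ids, logits, sep_id, positive_index=1):
    """Return P(positive) at each separator position.

    Assumes `logits` holds one [negative, positive] pair per token
    (a hypothetical layout chosen for this sketch).
    """
    return [
        softmax(row)[positive_index]
        for tok, row in zip(token_ids, logits)
        if tok == sep_id
    ]


if __name__ == "__main__":
    print(join_steps(["Step 1: find the center.", "Step 2: compute c."]))
    # Toy inputs: token id 9 stands in for the separator.
    toy_ids = [5, 9, 7, 9]
    toy_logits = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0], [-1.0, 1.0]]
    print(step_rewards(toy_ids, toy_logits, sep_id=9))
```

Each returned value lies between 0 and 1 because it is a softmax probability. In real use, the token ids and logits would come from the tokenizer and the model's forward pass.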
+# Reformatting Reasoning Intermediate
+
+| **Section** | **Content** |
+|-------------|-------------|
+| **Title** | Example of Solution Reformatting |
+| **Description** | This example demonstrates the reformatting of a solution for finding the foci of an ellipse. The original solution is generated by Qwen2-7B-Instruct, and the reformatted version is presented for clarity. |
+| **Problem Statement** | The ellipse \(\frac{(x-6)^{2}}{25}+\frac{(y-3)^{2}}{9}=1\) has two foci. Find the one with the larger x-coordinate. Enter your answer as an ordered pair, like \((2,1)\). |
+| **Original Solution** | The original solution is provided in a less readable format, with some syntax errors and unclear notation. |
+| **Reformatted Solution** | The reformatted solution clearly explains the steps: <br> 1. Identify the center of the ellipse \((6,3)\). <br> 2. Calculate the distance from the center to each focus using \(\sqrt{a^2 - b^2}\). <br> 3. Determine the foci locations at \((2,3)\) and \((10,3)\). <br> 4. Identify the focus with the larger x-coordinate as \((10,3)\). |
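The four steps of the reformatted solution can be checked with a short script. This is a minimal sketch: the function name `ellipse_foci` is hypothetical, and it assumes the major axis is horizontal (that is, `a_sq > b_sq`), as in the problem statement.

```python
import math


def ellipse_foci(h, k, a_sq, b_sq):
    """Foci of (x-h)^2/a_sq + (y-k)^2/b_sq = 1 with a horizontal major axis."""
    if a_sq <= b_sq:
        raise ValueError("this sketch assumes a_sq > b_sq (horizontal major axis)")
    c = math.sqrt(a_sq - b_sq)  # distance from the center to each focus
    return (h - c, k), (h + c, k)


if __name__ == "__main__":
    # Center (6, 3), a^2 = 25, b^2 = 9, so c = sqrt(16) = 4.
    left, right = ellipse_foci(6, 3, 25, 9)
    print(right)  # (10.0, 3)
```

The second returned point is the focus with the larger x-coordinate, which matches the answer \((10,3)\) in the table.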