seven-cat committed (verified)
Commit: b0b1f62 · Parent: 61f69d3

Update README.md

Files changed (1): README.md (+8 −44)

README.md CHANGED
@@ -10,7 +10,13 @@ language:
 
 ## Model Description
 
- `ReasonEval-7B` is a 7.1B parameter decoder-only language model tuned from [`WizardMath-7B-V1.1`](https://huggingface.co/WizardLM/WizardMath-7B-V1.1). `ReasonEval-7B` assesses the problem-solving process in a step-by-step format from the following perspectives:
+ `ReasonEval-7B` is a 7.1B parameter decoder-only language model tuned from [`WizardMath-7B-V1.1`](https://huggingface.co/WizardLM/WizardMath-7B-V1.1).
+
+ <p align="center">
+ <img src="introduction.jpg" alt="ReasonEval overview" style="width:95%;">
+ </p>
+
+ `ReasonEval-7B` assesses the problem-solving process in a step-by-step format from the following perspectives:
  - **Validity**: The step contains no mistakes in calculation and logic.
  - **Redundancy**: The step lacks utility in solving the problem but is still valid.
 
@@ -31,51 +37,9 @@ possibilities of each class of reasoning steps.
  * **Paper**: [Evaluating Mathematical Reasoning Beyond Accuracy](https://drive.google.com/file/d/1Lw1uGFzTUWxo3mB91sfdusSrxnCCO9mR/view?usp=sharing)
  * **Github**: [https://github.com/GAIR-NLP/ReasonEval](https://github.com/GAIR-NLP/ReasonEval)
  * **Finetuned from model**: [`https://huggingface.co/WizardLM/WizardMath-7B-V1.1`](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
+ * **Fine-tuning Data**: [`https://huggingface.co/WizardLM/WizardMath-7B-V1.1`](https://huggingface.co/WizardLM/WizardMath-7B-V1.1)
 
- ## Quick Start
- ### Setup
-
- * Clone the repository
-   ```bash
-   git clone https://github.com/GAIR-NLP/ReasonEval
-   cd ReasonEval
-   ```
- * Create a conda environment and activate it
-   ```bash
-   conda create -n ReasonEval python=3.10
-   conda activate ReasonEval
-   ```
- * Install the required libraries
-   ```bash
-   pip install -r requirements.txt
-   ```
- ### Usage
-
- Provide the question and the solution in a step-by-step format.
-
- ```python
- # example
- question = "Let $x,$ $y,$ and $z$ be positive real numbers such that $xyz(x + y + z) = 1.$ Find the minimum value of\n\\[(x + y)(y + z).\\]"
- reasoning_steps = ["1. The problem asks us to find the minimum value of $(x + y)(y + z)$ given that $x,$ $y,$ and $z$ are positive real numbers and $xyz(x + y + z) = 1$.",
- "2. By the AM-GM inequality, we have $x + y + z \\geq 3\\sqrt[3]{xyz}$.",
- "3. By the given condition $xyz(x + y + z) = 1$, we can substitute $x + y + z$ with $\\sqrt[3]{xyz}$ in the inequality from step 2 to get $3\\sqrt[3]{xyz} \\geq 3$.",
- "4. Simplifying the inequality from step 3 gives $\\sqrt[3]{xyz} \\geq 1$.",
- "5. By raising both sides of the inequality from step 4 to the power of 3, we have $xyz \\geq 1$.",
- "6. By the AM-GM inequality, we have $(x + y)(y + z) \\geq 2\\sqrt{(x + y)(y + z)}$.",
- "7. By the given condition $xyz(x + y + z) = 1$, we can substitute $(x + y)(y + z)$ with $\\frac{1}{xyz}$ in the inequality from step 6 to get $2\\sqrt{(x + y)(y + z)} \\geq 2\\sqrt{\\frac{1}{xyz}}$.",
- "8. Simplifying the inequality from step 7 gives $(x + y)(y + z) \\geq \\frac{2}{\\sqrt{xyz}}$.",
- "9. By the condition $xyz \\geq 1$ from step 5, we have $\\frac{2}{\\sqrt{xyz}} \\geq \\frac{2}{\\sqrt{1}} = 2$.",
- "10. Therefore, the minimum value of $(x + y)(y + z)$ is $\\boxed{2}$."]
- ```
-
- Run `./codes/examples.py` to get the validity and redundancy scores for each step.
- ```bash
- # Replace the 'question' and 'reasoning_steps' in ./codes/examples.py with your own content.
- # --model_name_or_path specifies the model name or path; --model_size indicates the model size of ReasonEval (7B or 34B).
- python ./codes/examples.py \
-     --model_name_or_path GAIR/ReasonEval-7B \
-     --model_size 7B
- ```
 
 ## How to Cite
 ```bibtex
 ```
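
The step-level validity and redundancy scores described above can be rolled up into solution-level judgments: a solution is only as valid as its weakest step, while one redundant step is enough to mark the solution as containing redundancy. The sketch below illustrates that min/max aggregation; `solution_scores` is an illustrative helper, not the repository's actual API, and the scores are made-up examples.

```python
# Illustrative aggregation of per-step ReasonEval scores (each in [0, 1])
# into solution-level scores: min over steps for validity, max for redundancy.
# NOTE: solution_scores is a hypothetical helper for this sketch, not part
# of the GAIR-NLP/ReasonEval codebase.
from typing import List, Tuple

def solution_scores(step_validity: List[float],
                    step_redundancy: List[float]) -> Tuple[float, float]:
    """Return (solution_validity, solution_redundancy) from step scores."""
    if not step_validity or len(step_validity) != len(step_redundancy):
        raise ValueError("need one (validity, redundancy) pair per step")
    # The weakest step bounds overall validity; the most redundant step
    # determines whether the solution contains redundancy.
    return min(step_validity), max(step_redundancy)

validity, redundancy = solution_scores(
    step_validity=[0.98, 0.95, 0.40, 0.97],    # step 3 has a likely mistake
    step_redundancy=[0.02, 0.75, 0.05, 0.03],  # step 2 is valid but unnecessary
)
print(validity, redundancy)  # 0.4 0.75
```

Any monotone aggregation would work here; min/max is simply the strictest choice, so a single bad step cannot be averaged away.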