## Model Description

`ReasonEval-34B` is a 34B parameter decoder-only language model fine-tuned from [`llemma_34b`](https://huggingface.co/EleutherAI/llemma_34b). Given a mathematical problem and the solution, `ReasonEval-34B` assesses the problem-solving process in a step-by-step format from the following perspectives:

- **Validity**: The step contains no mistakes in calculation and logic.
- **Redundancy**: The step lacks utility in solving the problem but is still valid.

With ReasonEval, you can

- 📏 quantify the quality of reasoning steps free of human or closed-source models.
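As a hypothetical illustration of how per-step scores of this kind could be rolled up into solution-level scores (the function name and the exact aggregation below are ours, not necessarily the repository's): a solution is only as valid as its weakest step, so validity aggregates with a minimum, while a single redundant step suffices to flag redundancy, so redundancy aggregates with a maximum.

```python
def solution_scores(step_probs):
    """Aggregate per-step (validity, redundancy) probabilities.

    Hypothetical aggregation: validity via min over steps (one invalid
    step invalidates the solution), redundancy via max over steps (one
    redundant step makes the solution redundant).
    """
    validity = min(p_valid for p_valid, _ in step_probs)
    redundancy = max(p_red for _, p_red in step_probs)
    return validity, redundancy

# Three steps, each scored as (P(valid), P(redundant)):
steps = [(0.98, 0.02), (0.60, 0.05), (0.95, 0.90)]
print(solution_scores(steps))  # (0.6, 0.9)
```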

classification head for next-token prediction is replaced with a classification head for outputting the probabilities of each class of reasoning steps.
* **Language(s)**: English
* **Paper**: [Evaluating Mathematical Reasoning Beyond Accuracy]()
* **Github**: [https://github.com/GAIR-NLP/ReasonEval](https://github.com/GAIR-NLP/ReasonEval)
* **Finetuned from model**: [https://huggingface.co/EleutherAI/llemma_34b](https://huggingface.co/EleutherAI/llemma_34b)
* **Fine-tuning Data**: [PRM800K](https://github.com/openai/prm800k)
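The head swap described above can be sketched as follows. This is a toy reconstruction, not the repository's implementation: the label set, hidden size, and initialization are all assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch only (not the repository's code): a small
# classification head that replaces the next-token LM head and maps a
# step's final hidden state to class probabilities. The three assumed
# labels ("valid" / "redundant" / "invalid") and all sizes are toy.
rng = np.random.default_rng(0)
hidden_size, num_labels = 16, 3

W = rng.normal(scale=0.02, size=(num_labels, hidden_size))  # head weights
b = np.zeros(num_labels)                                    # head bias

def step_class_probs(hidden_state):
    """Map a step's final hidden state to a probability per class."""
    logits = W @ hidden_state + b
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

probs = step_class_probs(rng.normal(size=hidden_size))
```

In the actual model the hidden state would presumably come from the fine-tuned `llemma_34b` backbone at each step boundary rather than from random input as here.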

For detailed instructions on how to use the ReasonEval-34B model, visit our GitHub repository at [https://github.com/GAIR-NLP/ReasonEval](https://github.com/GAIR-NLP/ReasonEval) and the [paper]().
## How to Cite

```bibtex
```