Update README.md
README.md (changed):
@@ -15,7 +15,7 @@ pipeline_tag: text-generation
 <img src="prox-teaser.png">
 </p>
 
-[ArXiv](http://arxiv.org/abs/
+[ArXiv](http://arxiv.org/abs/2409.17115) | [Data: OpenWebMath-Pro](https://huggingface.co/datasets/gair-prox/open-web-math-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)
 
 **Mistral-7B-ProXMath** is a math-adapted Mistral-7B-v0.1 model that is continually pre-trained on [OpenWebMath-Pro](https://huggingface.co/datasets/gair-prox/open-web-math-pro) (a refined version by ProX) for **10**B tokens.
 
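For readers of the updated card, a minimal usage sketch, not an official snippet from this README: it assumes the model is hosted at `gair-prox/Mistral-7B-ProXMath` and loads with the standard `transformers` auto-classes, and prompts it completion-style since this is a base (continually pre-trained) model rather than a chat model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gair-prox/Mistral-7B-ProXMath"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 7B weights fit on one modern GPU in bf16
    device_map="auto",           # requires `accelerate` to be installed
)

# Base LM: use few-shot / completion-style prompts, not a chat template.
prompt = "Question: What is 15% of 240?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```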
@@ -33,6 +33,10 @@ ProX models are evaluated on 9 common math reasoning benchmarks.
 
 ### Citation
 ```
-@
+@article{zhou2024programming,
+  title={Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale},
+  author={Zhou, Fan and Wang, Zengzhi and Liu, Qian and Li, Junlong and Liu, Pengfei},
+  journal={arXiv preprint arXiv:2409.17115},
+  year={2024}
 }
 ```
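The new link line also points at the training corpus. As a quick, hedged sketch for inspecting it, assuming the dataset's default configuration and a `train` split, one can stream a few refined documents without downloading the whole corpus:

```python
# Hedged sketch: peek at OpenWebMath-Pro, the ProX-refined corpus this
# model was continually pre-trained on. Assumes the default config and
# a "train" split; the per-document fields are whatever the dataset exposes.
from datasets import load_dataset

ds = load_dataset("gair-prox/open-web-math-pro", split="train", streaming=True)
for i, doc in enumerate(ds):
    print(doc)      # inspect one refined web-math document
    if i == 2:
        break       # three examples are enough for a schema check
```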