---
license: llama2
datasets:
- gair-prox/open-web-math-pro
language:
- en
base_model:
- codellama/CodeLlama-7b-hf
---

# CodeLlama-7B-ProXMath

<p align="center">
  <img src="prox-teaser.png">
</p>

[ArXiv](http://arxiv.org/abs/xxxx) | [Data: OpenWebMath-Pro](https://huggingface.co/datasets/gair-prox/open-web-math-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)

**CodeLlama-7B-ProXMath** is a language model continually pre-trained on [OpenWebMath-Pro](https://huggingface.co/datasets/gair-prox/open-web-math-pro), a version of OpenWebMath refined by ProX, for 10B tokens.
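It can be loaded with the standard Hugging Face `transformers` causal-LM API; the minimal sketch below assumes the repo id `gair-prox/CodeLlama-7B-ProXMath` (inferred from this card's organization, not stated in the original text) and an illustrative math prompt.

```python
# Minimal inference sketch; the repo id below is an assumption based on this
# card's organization and may need to be adjusted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gair-prox/CodeLlama-7B-ProXMath"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the `accelerate` package
)

prompt = "Question: What is the derivative of f(x) = x^3 + 2x?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```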
## Evaluations

ProX models are evaluated on 10 language model benchmarks in a zero-shot setting.

|      | ARC-c | ARC-e | CSQA | HellaSwag | MMLU | OBQA | PIQA | SIQA | WinoGrande | SciQ | AVG  |
|------|-------|-------|------|-----------|------|------|------|------|------------|------|------|
| raw  | 26.1  | 44.3  | 29.7 | 39.1      | 27.3 | 29.2 | 66.9 | 39.0 | 52.0       | 67.4 | 42.1 |
| ours | 26.4  | 51.9  | 30.9 | 42.4      | 29.4 | 31.6 | 67.9 | 40.0 | 52.2       | 73.5 | 44.6 |

*AVG* is the unweighted mean over the 10 benchmarks.
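The card does not name the evaluation harness; as a hedged sketch, a comparable zero-shot run could be set up with EleutherAI's `lm-evaluation-harness` (v0.4+). The harness choice, the repo id, and the task list below are assumptions (CSQA and SIQA are omitted because their task names vary across harness versions).

```python
# Hypothetical reproduction sketch using EleutherAI's lm-evaluation-harness (v0.4+).
# The harness, repo id, and task names are assumptions, not taken from this card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=gair-prox/CodeLlama-7B-ProXMath,dtype=bfloat16",
    tasks=["arc_challenge", "arc_easy", "hellaswag", "mmlu",
           "openbookqa", "piqa", "winogrande", "sciq"],
    num_fewshot=0,       # zero-shot, matching the table above
    batch_size="auto",
)
print(results["results"])  # per-task accuracy and aggregate metrics
```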
### Citation
```
@misc{TBD
}
```