Instructions to use LLaMAX/LLaMAX2-7B-MetaMath with libraries and local apps.

Libraries

How to use LLaMAX/LLaMAX2-7B-MetaMath with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="LLaMAX/LLaMAX2-7B-MetaMath")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LLaMAX/LLaMAX2-7B-MetaMath")
model = AutoModelForCausalLM.from_pretrained("LLaMAX/LLaMAX2-7B-MetaMath")
```
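
The snippets above only load the model. As a minimal sketch of running one query and reading off the result, the helper below accepts any text-generation pipeline (such as `pipe` above) and pulls out the value after the `The answer is:` marker that the README's sample output ends with. The names `extract_final_answer` and `solve` are illustrative and not part of Transformers. The prompt template from the README's Model Usage section is not shown on this page, so the caller is assumed to pass an already formatted prompt.

```python
import re
from typing import Any, Callable, Optional

# Marker taken from the README's sample output ("#### The answer is: 75").
_ANSWER_RE = re.compile(r"The answer is:\s*(-?\d[\d,]*(?:\.\d+)?)")


def extract_final_answer(generated_text: str) -> Optional[str]:
    """Return the last numeric answer after 'The answer is:', or None."""
    matches = _ANSWER_RE.findall(generated_text)
    if not matches:
        return None
    # Drop thousands separators so "1,050" becomes "1050".
    return matches[-1].replace(",", "")


def solve(pipe: Callable[..., Any], prompt: str, max_new_tokens: int = 512) -> Optional[str]:
    """Generate a completion with a text-generation pipeline and parse its answer.

    `prompt` is assumed to be formatted already; the template is not shown here.
    """
    # return_full_text=False asks the pipeline to return only the continuation.
    outputs = pipe(prompt, max_new_tokens=max_new_tokens, return_full_text=False)
    return extract_final_answer(outputs[0]["generated_text"])
```

With the pipeline from the first snippet, `solve(pipe, formatted_prompt)` would return a string such as `"75"` when the model ends its output with the marker, and `None` when it does not.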

Local Apps

vLLM

How to use LLaMAX/LLaMAX2-7B-MetaMath with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "LLaMAX/LLaMAX2-7B-MetaMath"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "LLaMAX/LLaMAX2-7B-MetaMath",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
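
The same completion request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above; `build_completion_request` and `complete` are hypothetical helper names, and the response is assumed to follow the OpenAI completions shape (`choices[0].text`). Pass `base_url="http://localhost:30000"` to target a server on another port.

```python
import json
import urllib.request

MODEL_ID = "LLaMAX/LLaMAX2-7B-MetaMath"


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    model: str = MODEL_ID,
    max_tokens: int = 512,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build the POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the request to a running server and return the generated text."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumes the OpenAI completions response layout.
    return body["choices"][0]["text"]
```

Keeping request construction separate from the network call makes the payload easy to inspect without a server running.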

Use Docker

```
docker model run hf.co/LLaMAX/LLaMAX2-7B-MetaMath
```

SGLang

How to use LLaMAX/LLaMAX2-7B-MetaMath with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "LLaMAX/LLaMAX2-7B-MetaMath" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "LLaMAX/LLaMAX2-7B-MetaMath",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "LLaMAX/LLaMAX2-7B-MetaMath" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "LLaMAX/LLaMAX2-7B-MetaMath",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Docker Model Runner
How to use LLaMAX/LLaMAX2-7B-MetaMath with Docker Model Runner:
```
docker model run hf.co/LLaMAX/LLaMAX2-7B-MetaMath
```

huangtao6 committed on Jul 9, 2024

Commit

29d7d78

1 Parent(s): f896060

update readme

Files changed (1)

README.md +13 -16

@@ -1,16 +1,16 @@
 ### Model Sources
-**Paper**: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
-Link: https://arxiv.org/pdf/2407
 ### Model Description
-🔥 LLaMAX-7B-MetaMath is fully fine-tuned on the MetaMathQA dataset based on the powerful multilingual model LLaMAX-7B.
-🔥 Compared with the [MetaMath-7B](https://huggingface.co/meta-math/MetaMath-7B-V1.0), LLaMAX-7B-MetaMath performs significantly better in mathematical reasoning in low-resource languages, improving the average accuracy of low-resource languages on MGSM dataset by up to 18.8%.
-🔥 LLaMAX-7B-MetaMath demonstrates good multilingual math reasoning capability in all languages, improving the average accuracy by 6.2% across all languages in MGSM dataset.
 ### Model Usage
@@ -46,20 +46,17 @@ the total number of words (1050) by the number of days in two weeks (14). So, th
 1050/14 = 75 words in each daily crossword puzzle on average. #### The answer is: 75“
 ```
 ### Experiments
-We evaluated LLaMAX-7B-MetaMath on the MGSM dataset. Compared with MetaMath-7B, LLaMAX-7B-MetaMath achieves a leading on both high-resource languages (Hrl.) and low-resource languages (Lrl.).
-| MGSM                        | Bn    | Th   | Sw | Ja    | Zh   | De | Fr | Ru   | Es | En | Lrl. | Hrl. | Avg.   |
-|-----------------------------|-------|------|----|-------|------|----|----|------|----|----|------|------|--------|
-| MetaMath-7B (official)   | 	6.8	 | 7.2  |6.8| 36.4  | 38.4 | 55.2|54.4| 52.0 |57.2|68.8| 6.9  | 51.8 | 38.32  |
-| MetaMath-7B (Reproduced) | 6.0   | 10.0 |4.4|36.4|42.8|52.8|56.0|48.8|58.8|64.8| 6.8  | 51.5 | 38.08  |
-| LLaMAX-7B-MetaMath     |26.8| 24.0 |26.0|35.6|42.4|56.8|55.2|53.6|56.8|65.6| 25.6 | 52.3 |  44.28 |
 ### Citation
 if our model helps your work, please cite this paper:
 ```
-@inproceedings{Huang2024MindMergerEB,
-  title={XLLaMA2: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages},
-  year={2024},
-}
 ```

 ### Model Sources
+- **Paper**: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
+- **Link**:
+- **Repository**: https://github.com/CONE-MT/LLaMAX/
 ### Model Description
+🔥 LLaMAX2-7B-MetaMath is fully fine-tuned on the MetaMathQA dataset based on the powerful multilingual model LLaMAX2-7B.
+🔥 Compared with the [MetaMath-7B](https://huggingface.co/meta-math/MetaMath-7B-V1.0), LLaMAX2-7B-MetaMath performs significantly better in mathematical reasoning in low-resource languages, improving the average accuracy of low-resource languages on MGSM dataset by up to 18.8%.
+🔥 LLaMAX2-7B-MetaMath demonstrates good multilingual math reasoning capability in all languages, improving the average accuracy by 6.2% across all languages in MGSM dataset.
 ### Model Usage
 1050/14 = 75 words in each daily crossword puzzle on average. #### The answer is: 75“
 ```
 ### Experiments
+We evaluated LLaMAX2-7B-MetaMath on the MGSM dataset. Compared with MetaMath-7B, LLaMAX-7B-MetaMath achieves a leading on both high-resource languages (Hrl.) and low-resource languages (Lrl.).
+| MGSM                      | Avg.    | Lrl. | Hrl.   | Bn     | Th   | Sw | Ja    | Zh   | De | Fr | Ru   | Es | En |
+|---------------------------|---------|------|--------|--------|------|----|----|------|----|----|------|------|--------|
+| MetaMath-7B (official)    | 38.32   | 6.9  | 51.8   | 6.8	   | 7.2  |6.8| 36.4 | 38.4 | 55.2|54.4| 52.0 |57.2|68.8|
+| MetaMath-7B (Reproduced)  | 38.08   | 6.8  | 51.5   | 6.0    | 10.0 |4.4| 36.4 |42.8|52.8|56.0|48.8|58.8|64.8|
+| LLaMAX2-7B-MetaMath       | 44.28   | 25.6 | 52.3   | 26.8   | 24.0 |26.0| 35.6 |42.4|56.8|55.2|53.6|56.8|65.6|
 ### Citation
 if our model helps your work, please cite this paper:
 ```
 ```
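
The aggregate columns in the MGSM table can be recomputed from the per-language scores. The sketch below assumes that Lrl. is the mean over Bn, Th, and Sw and Hrl. the mean over the remaining seven languages; the diff does not state this grouping, it is inferred from the numbers, which it reproduces. Under that assumption, the "18.8%" and "6.2%" figures in the Model Description appear to be percentage-point gains over the reproduced MetaMath-7B row.

```python
from statistics import mean

# Assumed grouping (inferred from the table, not stated in the README).
LOW_RESOURCE = ("Bn", "Th", "Sw")
HIGH_RESOURCE = ("Ja", "Zh", "De", "Fr", "Ru", "Es", "En")
LANGS = LOW_RESOURCE + HIGH_RESOURCE

# Per-language MGSM accuracy copied from the table in the diff.
SCORES = {
    "MetaMath-7B (official)": dict(zip(LANGS, [6.8, 7.2, 6.8, 36.4, 38.4, 55.2, 54.4, 52.0, 57.2, 68.8])),
    "MetaMath-7B (Reproduced)": dict(zip(LANGS, [6.0, 10.0, 4.4, 36.4, 42.8, 52.8, 56.0, 48.8, 58.8, 64.8])),
    "LLaMAX2-7B-MetaMath": dict(zip(LANGS, [26.8, 24.0, 26.0, 35.6, 42.4, 56.8, 55.2, 53.6, 56.8, 65.6])),
}


def aggregates(row: dict) -> dict:
    """Recompute the Lrl., Hrl., and Avg. columns for one table row."""
    return {
        "Lrl.": round(mean(row[lang] for lang in LOW_RESOURCE), 1),
        "Hrl.": round(mean(row[lang] for lang in HIGH_RESOURCE), 1),
        "Avg.": round(mean(row[lang] for lang in LANGS), 2),
    }


def gain(column: str, model: str = "LLaMAX2-7B-MetaMath",
         baseline: str = "MetaMath-7B (Reproduced)") -> float:
    """Percentage-point difference between two rows on one aggregate column."""
    return round(aggregates(SCORES[model])[column] - aggregates(SCORES[baseline])[column], 1)


for name, row in SCORES.items():
    print(name, aggregates(row))
print("Lrl. gain:", gain("Lrl."), "Avg. gain:", gain("Avg."))
```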