contamination...OpenMathInstruct-2
@bond005
I believe that OpenMathInstruct-2 is part of the training for this model, which unfortunately seems to be contaminated.
Regardless of which dataset is part of your training, the contamination of the model weights is a fact. But TBH, the numbers and samples involved are the same as in the previous case.
According to the contamination benchmarks:
- ~200 MATH tests were EXTRA contaminated
- ~35 MATH_HARD tests were EXTRA contaminated
Contamination tests for the base model (5-gram accuracy of Qwen2.5-1.5B-Instruct):
MATH_rewritten-test-1: 0.25320000000000004
MATH_rewritten-test-2: 0.2690666666666667
MATH_rewritten-test-3: 0.2692
orgn-MATH-test: 0.27053333333333335
...
GSM8K_rewritten-test-1: 0.21971190295678544
GSM8K_rewritten-test-2: 0.2227445034116755
GSM8K_rewritten-test-3: 0.2172858225928734
orgn-GSM8K-test: 0.23290371493555728
Contamination tests for this model (5-gram accuracy of meno-tiny-0.1):
MATH_rewritten-test-1: 0.3384666666666667
MATH_rewritten-test-2: 0.3502666666666667
MATH_rewritten-test-3: 0.3504666666666667
orgn-MATH-test: 0.3519333333333334
...
GSM8K_rewritten-test-1: 0.23320697498104626
GSM8K_rewritten-test-2: 0.2400303260045489
GSM8K_rewritten-test-3: 0.23290371493555728
orgn-GSM8K-test: 0.26277482941622443
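To make the two result sets easier to compare, the per-split difference can be computed directly from the numbers posted above. This is only a small sketch: the names `base`, `tuned` and `accuracy_deltas` are illustrative and are not part of benbench.

```python
# 5-gram accuracy values copied from the results posted in this thread.
base = {  # Qwen2.5-1.5B-Instruct
    "MATH_rewritten-test-1": 0.25320000000000004,
    "MATH_rewritten-test-2": 0.2690666666666667,
    "MATH_rewritten-test-3": 0.2692,
    "orgn-MATH-test": 0.27053333333333335,
    "GSM8K_rewritten-test-1": 0.21971190295678544,
    "GSM8K_rewritten-test-2": 0.2227445034116755,
    "GSM8K_rewritten-test-3": 0.2172858225928734,
    "orgn-GSM8K-test": 0.23290371493555728,
}
tuned = {  # meno-tiny-0.1
    "MATH_rewritten-test-1": 0.3384666666666667,
    "MATH_rewritten-test-2": 0.3502666666666667,
    "MATH_rewritten-test-3": 0.3504666666666667,
    "orgn-MATH-test": 0.3519333333333334,
    "GSM8K_rewritten-test-1": 0.23320697498104626,
    "GSM8K_rewritten-test-2": 0.2400303260045489,
    "GSM8K_rewritten-test-3": 0.23290371493555728,
    "orgn-GSM8K-test": 0.26277482941622443,
}


def accuracy_deltas(before, after):
    """Return after - before for every split present in both result sets."""
    return {name: after[name] - before[name] for name in before if name in after}


deltas = accuracy_deltas(base, tuned)
for name, delta in deltas.items():
    # Print base value, fine-tuned value and the signed difference.
    print(f"{name:24s} {base[name]:.4f} -> {tuned[name]:.4f} ({delta:+.4f})")
```

The differences only restate the posted numbers; how large a gap has to be before it counts as contamination is a judgment the script does not make.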
The reproduction is simple:
- https://github.com/GAIR-NLP/benbench
- modify the src/script to use the model, and set the test to math or gsm8k
- run and get the results
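For readers unfamiliar with the metric in the logs, here is a minimal sketch of how an n-gram accuracy of this kind can be computed: the model sees a prefix of a benchmark sample and must reproduce the next n tokens exactly. It is an illustration under assumptions, not benbench's actual code. The whitespace tokens, the evenly spaced starting points, and the `Predictor` interface with its toy "memorizer" model are all hypothetical, and the real tool may tokenize and sample differently.

```python
from typing import Callable, List, Sequence

Tokens = List[str]
# Hypothetical model interface: given a prefix, return the next n tokens.
Predictor = Callable[[Tokens, int], Tokens]


def start_points(length: int, n: int, k: int) -> List[int]:
    """Up to k evenly spaced positions with a non-empty prefix and n tokens left."""
    last = length - n  # latest position that still leaves n tokens to predict
    if last < 1:
        return []
    if k <= 1:
        return [1]
    step = (last - 1) / (k - 1)
    return sorted({1 + int(i * step) for i in range(k)})


def ngram_accuracy(samples: Sequence[Tokens], predict: Predictor,
                   n: int = 5, k: int = 5) -> float:
    """Fraction of starting points where the predicted n-gram matches exactly."""
    hits = total = 0
    for tokens in samples:
        for p in start_points(len(tokens), n, k):
            total += 1
            hits += predict(tokens[:p], n) == tokens[p:p + n]
    return hits / total if total else 0.0


def make_memorizer(memorized: Sequence[Tokens]) -> Predictor:
    """Toy stand-in for a contaminated model: continues only texts it has seen."""
    def predict(prefix: Tokens, n: int) -> Tokens:
        for text in memorized:
            if text[:len(prefix)] == prefix:
                return text[len(prefix):len(prefix) + n]
        return []
    return predict


seen = "a b c d e f g h i j k l".split()
unseen = "m n o p q r s t u v w x".split()
toy_model = make_memorizer([seen])
print(ngram_accuracy([seen], toy_model, n=5, k=3))
print(ngram_accuracy([unseen], toy_model, n=5, k=3))
```

The toy model scores perfectly on text it has memorized and zero on text it has not, which is the intuition behind reading a higher n-gram accuracy on benchmark data as a sign of exposure.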
Hi!
Thank you for your comment. However, I didn't use nvidia/OpenMathInstruct-2 for training.
My training dataset consisted of many separate datasets in Russian and English, which can be divided into three groups:
- Fully synthetic datasets that I generated using a large model.
- Datasets automatically translated from English to Russian, focused on solving mathematical and logical problems.
- Russian-language datasets derived from NLP tasks for the Russian language (paraphrasing, summarization, etc.).
In the second group, there were the TIGER-Lab/MathInstruct and KK04/LogicInference_OA datasets, which I translated into Russian using NLLB-200-3.3B, followed by automated error checking and translation hallucination detection. However, the TIGER-Lab/MathInstruct dataset and the nvidia/OpenMathInstruct-2 dataset are different datasets, as far as I understand, even though they belong to the same subject area.
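As an illustration of the translation step described above, here is a rough sketch of English-to-Russian translation with NLLB through the Transformers library. It is not the author's pipeline. The Hub ID `facebook/nllb-200-3.3B` is an assumption about which checkpoint was meant, and `length_ratio_ok` is a purely hypothetical stand-in for the unspecified error and hallucination checks.

```python
from typing import List


def translate_en_to_ru(texts: List[str],
                       model_name: str = "facebook/nllb-200-3.3B",  # assumed Hub ID
                       max_new_tokens: int = 512) -> List[str]:
    """Translate English texts to Russian with an NLLB-200 checkpoint."""
    # Imported lazily so the helper below can be used without transformers/torch.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="eng_Latn")
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **inputs,
        # NLLB selects the target language through the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("rus_Cyrl"),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


def length_ratio_ok(source: str, translation: str,
                    low: float = 0.5, high: float = 2.0) -> bool:
    """Hypothetical sanity check: flag empty or wildly longer/shorter translations."""
    if not source.strip() or not translation.strip():
        return False
    ratio = len(translation) / len(source)
    return low <= ratio <= high


if __name__ == "__main__":
    sources = ["Solve the equation 2x + 3 = 7."]
    for src, tgt in zip(sources, translate_en_to_ru(sources)):
        print(tgt, length_ratio_ok(src, tgt))
```

A length-ratio filter catches only crude failures such as runaway repetition or empty output; real hallucination detection would need stronger checks.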
Regardless of which dataset is part of your training, the contamination of the model weights is a fact.
So this is used to sum up the results and check the model's accuracy?