Update README.md
README.md
CHANGED
@@ -14,7 +14,8 @@ base_model:

 # Qwen3-0.6B-Math

-This model is obtained by fine-tuning Qwen/Qwen3-0.6B on the gsm8k
+This model is obtained by fine-tuning Qwen/Qwen3-0.6B on the [gsm8k](https://huggingface.co/datasets/openai/gsm8k) train split.
+The model is used in the experiments described in https://bknyaz.github.io/blog/2026/meta-merge/.
 Single A100 was used for fine-tuning and evaluation.

 The following versions were used for train/eval:
@@ -75,4 +76,4 @@ python -m lm_eval --model vllm --model_args pretrained=${model},tensor_parallel_

 ## License

-Please refer to the license of the original model [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B).
+Please refer to the license of the original model [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) and dataset [gsm8k](https://huggingface.co/datasets/openai/gsm8k).