# Qwen3-0.6B-Math
This model is obtained by fine-tuning Qwen/Qwen3-0.6B on the gsm8k train split. It is used in the experiments described in https://bknyaz.github.io/blog/2026/meta-merge/.
A single A100 was used for fine-tuning and evaluation.

The following versions were used for training/evaluation:

- python >= 3.10
- torch: 2.9.0+cu128
- lm_eval: 0.4.9.1
- vllm: 0.11.1
- transformers: 4.57.6
- datasets: 3.2.0
- numpy: 2.2.6
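The version pins above can be sanity-checked against the active environment with a short snippet. This is just a convenience sketch; the distribution names are assumed to match the list (e.g. `lm_eval` may be installed as `lm-eval`), so adjust them if lookups fail:

```python
# Sketch: compare installed package versions against the pins listed above.
# Packages that are absent are reported rather than raising.
from importlib.metadata import version, PackageNotFoundError

expected = {
    "torch": "2.9.0",
    "lm_eval": "0.4.9.1",
    "vllm": "0.11.1",
    "transformers": "4.57.6",
    "datasets": "3.2.0",
    "numpy": "2.2.6",
}

for pkg, want in expected.items():
    try:
        have = version(pkg)
    except PackageNotFoundError:
        have = "not installed"
    status = "OK" if have.startswith(want) else "MISMATCH"
    print(f"{pkg}: expected {want}, found {have} [{status}]")
```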

## Training

The [TRL](https://github.com/huggingface/trl) library was used with SFT/full-rank options:

```bash
python trl/scripts/sft.py --model_name_or_path Qwen/Qwen3-0.6B --dataset_name openai/gsm8k --dataset_config main --learning_rate 2e-5 \
  --num_train_epochs 1 --per_device_train_batch_size 2 --gradient_accumulation_steps 8 --gradient_checkpointing --eos_token '<|im_end|>' --eval_strategy steps \
  --eval_steps 100 --completion_only_loss True --report_to wandb --output_dir /path/to/the/finetuned/model
```

This is far from the most compute- or performance-efficient fine-tuning setup, but it can serve as a reasonable baseline.

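As a sanity check on the flags above, the effective global batch size they imply (assuming the single-GPU setup described earlier, with no data-parallel multiplier) works out as:

```python
# Effective global batch size implied by the training flags above.
per_device_train_batch_size = 2
gradient_accumulation_steps = 8
num_gpus = 1  # a single A100 was used

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)
print(effective_batch_size)  # → 16
```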
The dataset was preprocessed into the conversational format:

```python