bknyaz committed on
Commit 80e33c1 · verified · 1 Parent(s): d5d8db7

Update README.md

Files changed (1): README.md (+14 -2)
# Qwen3-0.6B-Math

This model is obtained by fine-tuning Qwen/Qwen3-0.6B on the gsm8k train split. It was used in the experiments described in https://bknyaz.github.io/blog/2026/meta-merge/.
A single A100 was used for fine-tuning and evaluation.

The following versions were used for train/eval:

- python >= 3.10
- torch : 2.9.0+cu128
- lm_eval : 0.4.9.1
- vllm : 0.11.1
- transformers : 4.57.6
- datasets : 3.2.0
- numpy : 2.2.6
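
The pins above can be checked against the active environment with the standard library alone. This is a minimal sketch (not part of the original README); it assumes the installed distribution names match the list, and simply reports anything missing or mismatched rather than failing:

```python
# Hedged sketch: compare installed package versions against the pins above.
# Distribution names are assumed to match the list (importlib.metadata
# normalizes "-"/"_" on Python >= 3.10); missing packages are just reported.
import importlib.metadata as md

pins = {
    "torch": "2.9.0+cu128",
    "lm_eval": "0.4.9.1",
    "vllm": "0.11.1",
    "transformers": "4.57.6",
    "datasets": "3.2.0",
    "numpy": "2.2.6",
}

results = {}
for pkg, expected in pins.items():
    try:
        installed = md.version(pkg)
        results[pkg] = "ok" if installed == expected else f"mismatch (have {installed})"
    except md.PackageNotFoundError:
        results[pkg] = "not installed"
    print(pkg, "->", results[pkg])
```
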

## Training

The [TRL](https://github.com/huggingface/trl) library was used with SFT/full-rank options:

```bash
python trl/scripts/sft.py --model_name_or_path Qwen/Qwen3-0.6B --dataset_name openai/gsm8k --dataset_config main --learning_rate 2e-5 \
--num_train_epochs 1 --per_device_train_batch_size 2 --gradient_accumulation_steps 8 --gradient_checkpointing --eos_token '<|im_end|>' --eval_strategy steps \
--eval_steps 100 --completion_only_loss True --report_to wandb --output_dir /path/to/the/finetuned/model
```
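
For orientation, the batch flags above can be turned into an effective batch size and a rough step count. A small arithmetic sketch; the GSM8K train-split size (7,473 examples) is taken from the dataset card, not from this README:

```python
# Effective batch size implied by the flags above, on a single device.
per_device_train_batch_size = 2
gradient_accumulation_steps = 8
num_devices = 1  # single A100, per the README

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
print(effective_batch_size)  # 16

# Optimizer steps in one epoch, assuming GSM8K main's 7,473 training
# examples (from the dataset card) and that the final partial batch
# still counts as a step (ceiling division).
gsm8k_train_examples = 7473
steps_per_epoch = -(-gsm8k_train_examples // effective_batch_size)
print(steps_per_epoch)  # 468
```
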

This is far from the most compute- and performance-efficient fine-tuning, but it can serve as a good baseline.

The dataset was preprocessed to the conversational format:

```python