---
library_name: transformers
tags:
- small-lm
- code
- reasoning
- slm
license: apache-2.0
datasets:
- theblackcat102/evol-codealpaca-v1
base_model:
- Qwen/Qwen3-0.6B
---

# Qwen3-0.6B-Code

This model was obtained by fine-tuning Qwen/Qwen3-0.6B on the train split of [evol-codealpaca-v1](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1).
It is used in the experiments described in https://bknyaz.github.io/blog/2026/meta-merge/.
A single A100 GPU was used for both fine-tuning and evaluation.

The following versions were used for training and evaluation:

- python >= 3.10
- torch : 2.9.0+cu128
- lm_eval : 0.4.9.1
- vllm : 0.11.1
- transformers : 4.57.6
- datasets : 3.2.0
- numpy : 2.2.6

## Training

The [TRL](https://github.com/huggingface/trl) library was used with SFT/full-rank options:

```bash
python trl/scripts/sft.py --model_name_or_path Qwen/Qwen3-0.6B --dataset_name theblackcat102/evol-codealpaca-v1 --learning_rate 2e-5 \
--num_train_epochs 1 --per_device_train_batch_size 2 --gradient_accumulation_steps 8 --gradient_checkpointing --eos_token '<|im_end|>' --eval_strategy no \
--completion_only_loss True --report_to wandb --output_dir /path/to/the/finetuned/model
```

With a per-device batch size of 2 and 8 gradient accumulation steps on one GPU, the effective batch size is 16.
This setup is far from the most compute- or performance-efficient way to fine-tune, but it can serve as a reasonable baseline.

The dataset was preprocessed into the conversational format:

```python
# trl/scripts/sft.py
dataset = load_dataset(...)

def preprocess_function(example):
    # Turn each instruction/output pair into a user prompt and an assistant completion.
    return {
        "prompt": [{"role": "user", "content": example["instruction"]}],
        "completion": [
            {"role": "assistant", "content": example["output"]}
        ],
    }

dataset = dataset.map(preprocess_function)
```

## Evaluation

Evaluation was done with lm_eval on the HumanEval (instruct) benchmark:

```bash
python -m lm_eval --model vllm --model_args pretrained=${model},tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.9,data_parallel_size=1 \
--tasks humaneval_instruct --batch_size 1 --apply_chat_template=True --confirm_run_unsafe_code --trust_remote_code
```

### Results

| Model           | humaneval_instruct |
|-----------------|--------------------|
| Qwen3-0.6B      | 38.4               |
| Qwen3-0.6B-Code | 46.3               |

## License

Please refer to the licenses of the original model [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) and of the dataset [evol-codealpaca-v1](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1).
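
## Example: preprocessing one record

To make the conversational format from the Training section concrete, the sketch below applies the same `preprocess_function` to a single toy record. The record contents and the name `toy_example` are made up for illustration and are not taken from evol-codealpaca-v1. Only the field names `instruction` and `output` come from the snippet in this card. The sketch uses plain Python and does not call `datasets` or TRL.

```python
def preprocess_function(example):
    # Same mapping as in the Training section: the instruction becomes the
    # user turn and the reference solution becomes the assistant turn.
    return {
        "prompt": [{"role": "user", "content": example["instruction"]}],
        "completion": [{"role": "assistant", "content": example["output"]}],
    }


# Hypothetical record for illustration only (not a real dataset row).
toy_example = {
    "instruction": "Write a Python function that adds two numbers.",
    "output": "def add(a, b):\n    return a + b",
}

converted = preprocess_function(toy_example)

# Show the role and text of each turn in the converted record.
for field in ("prompt", "completion"):
    for message in converted[field]:
        print(f"{field} / {message['role']}: {message['content']!r}")
```

Splitting each record into `prompt` and `completion` is what lets the `--completion_only_loss True` option in the training command restrict the loss to the assistant turn.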